Whether you define big data as analysis of extremely large data volumes, analysis of complex data sets, high velocity data analysis, a combination of these three, or something different, you probably have a big data project underway in your IT organization right now.

Your big data environment likely contains several newer technologies, such as Hadoop and a NoSQL database like MongoDB. It probably also includes a hodgepodge of other technologies, some commercial and some open source: for example, Storm for stream processing, Dremel for ad hoc querying, Gremlin for graph analysis, and perhaps SAP HANA if SAP is your ERP provider.

In Our Experience

Once a big data project gets underway, senior business leaders want it to move fast so they can start reaping the benefits. Your project team is probably working longer-than-normal hours to build, scale up, test, and deploy those big data services for your business, and that’s great. Big data is one area where IT can really demonstrate its impact on the business’s bottom line.

In our experience, though, that speed also creates risks:

  • The speed of big data projects often means the knowledge you are accumulating about how to build and manage big data services, including their dependencies and interconnections with application data sources, isn’t being documented well
  • Once big data services are available and the business falls in love with using them, changes to the supporting infrastructure could break them, putting your IT organization in a highly exposed, negative light
  • Changes in the application environments that feed your big data services can also produce unexpected ripple effects and big data service interruptions
  • Big data projects typically center on sensitive customer information, so understanding how regulations and policies govern your big data infrastructure is also critical to avoiding audit findings and extended remediation efforts

The real big data challenge is not just moving fast. It is being able to modify and adapt your big data services and supporting infrastructure while keeping the data you use in a state of continuous compliance.

How ITinvolve can help

  1. Start by leveraging the big data environment information you already have (even if it’s not complete or entirely accurate)
  2. Bring that data together visually in one place (importing or federating the details as necessary), and model the dependencies and relationships in your big data environment and in the application environments that supply its data
  3. Leverage modern social collaboration principles to follow and peer review this existing knowledge about your big data configurations and dependencies
  4. Quickly fill in any gaps with the undocumented tribal knowledge that’s been accumulated by your project team
  5. Keep the documentation effort going by continuing to feed in information from third-party configuration tools and updates from project team members (we keep everyone informed and engaged as new information is added)
  6. Once your big data services are ready, we ensure change planners and approval boards have the configuration and dependency information they need to identify and avoid potential risks, so operations and compliance continue uninterrupted
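
To make the dependency modeling in step 2 and the change-risk check in step 6 more concrete, here is a minimal sketch in Python. It is only an illustration of the general idea, not how ITinvolve implements it: the component names, the `DEPENDENTS` map, and the `impacted_by` function are all hypothetical. Each entry records which items depend on a given component, and a breadth-first walk lists everything downstream of a proposed change.

```python
from collections import deque

# Hypothetical dependency map (illustrative names only):
# each key feeds or supports the items in its list.
DEPENDENTS = {
    "crm_app_db": ["hadoop_ingest"],
    "hadoop_cluster": ["hadoop_ingest", "churn_report"],
    "hadoop_ingest": ["churn_report"],
    "mongodb_store": ["recommendations_api"],
}


def impacted_by(changed, dependents):
    """Return every item downstream of `changed`, found breadth-first."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# A change to the CRM application database ripples into the ingest job
# and the report built on top of it.
print(impacted_by("crm_app_db", DEPENDENTS))
```

A change planner could run a check like this before approving a change to see which big data services might be affected.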





Fully understand relationships and dependencies for big data environments (including policy compliance requirements)