What are data silos?
A data silo is a collection of information in an organization that is isolated from, and not accessible by, other parts of the organization. Removing data silos can help you get the right information at the right time so you can make good decisions. You can also save money by reducing storage costs for duplicate information.
How do data silos occur?
Data silos happen for three common reasons:
- Company culture: Often departments are siloed from each other, especially in larger companies. Sometimes this occurs because there is internal competition, but often it happens because one department sees itself as separate from another and doesn’t consider where information should be shared.
- Organizational structures: Unless an organization specifically works to integrate different departments, it’s easy to build layers of hierarchy and management that deter departments from sharing information.
- Technology: It’s not uncommon for different departments to use different technology, making it difficult for the departments to share common information. For example, maybe the Sales team uses Salesforce, but the Marketing team doesn’t have this tool. Yet Salesforce might contain valuable information that the Marketing team could use. An IT survey showed that most companies have between 1 and 200 applications across their different departments. Consider how unwieldy it can be to find information when you have so many different sources.
Why are data silos a problem?
Data silos are a problem for three primary reasons:
- Inability to get a comprehensive view of data. If your data is siloed, relevant connections between siloed data can easily be missed. Suppose, for example, the Marketing team has excellent data on which Marketing campaigns attracted a lot of attention in a particular geography, whereas the Sales team has information about sales in that same geography. What if you could bring that information together? Imagine how much clearer the relationship between Marketing campaigns and sales would be.
- Wasted resources. Consider what happens if you have a database with customer information for the Marketing team and a separate one for the Sales team. Much of the data is duplicated between these departments. It costs money to store all this data, and the more a company spends on storage, the less it can spend on other requirements.
- Inconsistent data. In data silos, it’s common to store the same information in different places. When this happens, there is a high chance that you will introduce data inconsistencies. You might update a customer address in one place and not another. Or, you might introduce a typo in one set of information. When the data is in one place, you have a much better chance of maintaining the correct information.
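The duplication and inconsistency problems above can be made concrete with a small sketch. The Python below assumes two hypothetical customer exports, one from a Marketing system and one from a Sales system, both keyed by a customer ID. The record contents and the `compare_silos` helper are invented for illustration and are not taken from any particular product.

```python
# Hypothetical records exported from two siloed systems, keyed by customer ID.
marketing_customers = {
    "C001": {"name": "Ada Lopez", "address": "12 Elm St"},
    "C002": {"name": "Ben Okafor", "address": "9 Oak Ave"},
    "C003": {"name": "Chen Wei", "address": "4 Pine Rd"},
}
sales_customers = {
    "C001": {"name": "Ada Lopez", "address": "12 Elm St"},
    "C002": {"name": "Ben Okafor", "address": "77 Birch Ln"},  # updated in Sales only
    "C004": {"name": "Dana Kim", "address": "1 Maple Ct"},
}


def compare_silos(left, right):
    """Return IDs stored in both silos and the field-level mismatches among them."""
    shared_ids = sorted(left.keys() & right.keys())
    mismatches = {}
    for customer_id in shared_ids:
        # Collect every field whose value differs between the two copies.
        differing = {
            field: (left[customer_id][field], right[customer_id].get(field))
            for field in left[customer_id]
            if left[customer_id][field] != right[customer_id].get(field)
        }
        if differing:
            mismatches[customer_id] = differing
    return shared_ids, mismatches


shared, mismatches = compare_silos(marketing_customers, sales_customers)
print("Stored in both silos:", shared)
print("Inconsistent records:", mismatches)
```

Every ID in `shared` is a record that is stored, and paid for, twice. Every entry in `mismatches` is a record where the two copies have drifted apart, as happens when an address is updated in only one system.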
Challenges in dealing with siloed data
While many companies recognize that data silos are a problem, undoing them can be a challenge. Once you have an entrenched culture of separating data, it’s hard to change the mindset of employees. Additionally, it may be difficult to undo some of the silos because of the way that systems are set up with various permissions and hierarchies. For example, permissions are often set up by group, so once the data is siloed for a group, it’s hard to then change all the necessary permissions. And if the data is siloed in different systems (for example, data for the Security Operations group is stored in an Oracle database, but the Sales information is in Salesforce), it’s even harder to reconcile the silos.
To simplify this process, many companies move their data from their various systems into a data warehouse. A data warehouse is a repository for all data collected by an enterprise’s operational systems. Data warehouses are optimized for access and analysis rather than transactional processing, and they are designed to help management get a 360-degree view of their company’s data.
Ways to break down data silos
The best way to remove data silos is to consolidate your data into a data warehouse. Here are a few different methods a company might use to get data into a data warehouse:
- Scripting. Some companies write scripts (in SQL, Python, or a similar language) to extract the data and move it to a central location. This can be time-consuming, however, and it requires a great deal of expertise.
- On-premises ETL tools. ETL (Extract, Transform, Load) tools can take much of the pain out of moving data by automating the process. They extract the data from your source, perform transformations, and then load the data to the target data warehouse. These tools are typically hosted on your company’s own infrastructure.
- Cloud-based ETL tools. These ETL tools are hosted in the cloud, where you can leverage the expertise and infrastructure of the vendor. They are commonly used when a company decides to move siloed data to a cloud data warehouse.
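To give a feel for what the scripting approach involves, here is a minimal sketch of the extract, transform, and load steps in Python. It rests on several assumptions: the two CSV strings stand in for exports from a Marketing system and a Sales system, an in-memory SQLite database stands in for the data warehouse, and the table names, column names, and figures are all hypothetical. A production pipeline would also need scheduling, error handling, and incremental loads, which is part of why scripting takes expertise.

```python
import csv
import io
import sqlite3

# Hypothetical exports from two siloed systems (stand-ins for real source extracts).
MARKETING_CSV = """region,campaign,clicks
 north ,Spring Promo,1200
South,Summer Sale,800
"""
SALES_CSV = """region,revenue
North,5000
north,2500
SOUTH,3000
"""


def extract(raw_csv):
    """Extract: read rows from a source export."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows, numeric_field):
    """Transform: normalize region names and convert one field to an integer."""
    cleaned = []
    for row in rows:
        row = dict(row)
        row["region"] = row["region"].strip().title()
        row[numeric_field] = int(row[numeric_field])
        cleaned.append(row)
    return cleaned


def load(conn, table, rows):
    """Load: write cleaned rows into a warehouse table.

    Table and column names are hard-coded by this script, so building the
    SQL with an f-string is acceptable here; values still use placeholders.
    """
    columns = list(rows[0].keys())
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [tuple(row[c] for c in columns) for row in rows],
    )


# In-memory SQLite is only a stand-in for a real data warehouse.
conn = sqlite3.connect(":memory:")
load(conn, "campaigns", transform(extract(MARKETING_CSV), "clicks"))
load(conn, "sales", transform(extract(SALES_CSV), "revenue"))

# With both data sets in one place, campaigns and sales can be viewed together.
combined = conn.execute(
    """
    SELECT c.region, c.campaign, c.clicks, SUM(s.revenue) AS revenue
    FROM campaigns AS c
    JOIN sales AS s ON s.region = c.region
    GROUP BY c.region, c.campaign, c.clicks
    ORDER BY c.region
    """
).fetchall()
print(combined)
```

The final query is the payoff of consolidation: once the region names are normalized and both tables sit in the same store, a single join relates campaign activity to sales in the same geography.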