Monday, December 14, 2009

Data Quality Management versus Improvement - Just Semantics?

Data Quality Management versus Data Quality Improvement...what's the difference?

Maybe it's just me, but it seems that Data Quality Management is a term used in cases where Data Quality Improvement is more appropriate. In fact, Data Quality Management appears to be a universal term for all things dealing with data quality.

Management of data quality implies that some level of quality exists that needs to be managed. The quality may get better or deteriorate, but "management" suggests that efforts are aimed at keeping data quality at or above certain levels. These efforts may include fixing or cleaning up data, but the implication of the term "management" is maintenance.

Improvement of data quality implies that data is not at a desired level of quality and needs to be improved. These efforts could include assessing the current state of critical data, formulating improvement programs with targeted results (e.g., cut the error rate by 50% within 90 days), and implementing those programs. The term "improvement" conjures up a mix of changes aimed solely at making data quality better than it is.
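The kind of targeted result mentioned above (e.g., cut the error rate by 50% within 90 days) can be tracked with a simple baseline-versus-current comparison. The following is a minimal sketch; the record layout, the postal-code validation rule, and all names here are illustrative assumptions, not anything from a specific program.

```python
# Hypothetical sketch: tracking an error-rate reduction target.
# The record layout and validation rule below are illustrative assumptions.

def error_rate(records, is_valid):
    """Fraction of records that fail the validation check."""
    if not records:
        return 0.0
    errors = sum(1 for r in records if not is_valid(r))
    return errors / len(records)

def target_met(baseline_rate, current_rate, reduction=0.50):
    """True if the error rate has been cut by at least the target fraction."""
    return current_rate <= baseline_rate * (1 - reduction)

# Illustrative data: customer records with a required postal code.
baseline = [{"postal": "98101"}, {"postal": ""}, {"postal": None}, {"postal": "98052"}]
current  = [{"postal": "98101"}, {"postal": ""}, {"postal": "98004"}, {"postal": "98052"}]

valid = lambda r: bool(r.get("postal"))
b, c = error_rate(baseline, valid), error_rate(current, valid)
print(b, c, target_met(b, c))  # baseline 0.5, current 0.25 -> target met
```

The point of the sketch is only that an improvement program needs a measured baseline and a measured target; without both, "improvement" collapses back into open-ended maintenance.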

Why should we care? I think the distinction is important because labeling an organizational program "Data Quality Management" can steer it down certain paths. If the program is, or should be, about quality improvement, the "management" label makes it much more difficult to get out of data maintenance and reorient efforts around improvement and prevention.

Data quality management is ideal for IT organizations because IT is good at putting technologies in place to maintain the data. Leaving data quality to IT puts accountability for the quality of data with those who administer it. Plenty has been written and put into practice around reinforcing accountability for data quality with those who create it...usually the business. This makes sense because the creators and consumers of data have the most skin in the game when it comes to data quality. Since improving quality is usually a recipe of people, process, and technology changes, the business is probably better suited to drive a quality improvement effort. While it is not a universal truth, odds are that leaving data quality to IT will result in lots of technology being thrown at the problems, but data creators will still create poor quality data.

It seems to me that this is much more than semantics. The implications of labeling efforts around data quality inaccurately could steer organizations in less-than-desirable data quality directions. If an organization wants to maintain its data, then "data quality management" seems appropriate. However, if data quality goes beyond maintenance with a focus around improvement, then "data quality improvement" makes more sense to me. Data quality management versus improvement - just semantics? I think not.

Thursday, January 15, 2009

Full-Spectrum Data Quality Management...What Is It?

Most people agree that data quality is not just an IT issue, but one belonging to the enterprise. That implies that data quality issues, and participation in resolving them, cross organizational silos. That certainly sounds like it could be full-spectrum data quality, but is it?

Data quality management left solely to IT often results in approaches that start from the database layer and work their way out into the business. Organizations where IT is not tightly integrated with other parts of the business tend to stop once the application layer is reached, assuming that problems rooted in people and process are management issues for the business to tackle.

Data profiling by IT is an exercise of evaluating data values in databases relative to pre-determined quality thresholds, which may not focus attention on the data that matters most to the organization. Data quality improvement approaches often include automation (e.g., batch data cleansing, “scripts” that change values across the database, etc.). Error prevention strategies often involve edits and other controls in front-end systems and data validation routines that run in the background.
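The threshold-style profiling described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical table of rows and pre-determined completeness thresholds per column; the column names and threshold values are made up for the example.

```python
# Hypothetical sketch of column profiling against pre-determined thresholds.

def profile(rows, thresholds):
    """Return the columns whose completeness falls below their threshold."""
    failures = {}
    for col, minimum in thresholds.items():
        filled = sum(1 for r in rows if r.get(col) not in (None, ""))
        completeness = filled / len(rows) if rows else 0.0
        if completeness < minimum:
            failures[col] = completeness
    return failures

# Illustrative rows with some missing values.
rows = [
    {"id": 1, "email": "a@example.com", "phone": ""},
    {"id": 2, "email": "", "phone": "555-0101"},
    {"id": 3, "email": "c@example.com", "phone": None},
]

# Pre-determined thresholds: 90% of emails and 75% of phones must be populated.
print(profile(rows, {"email": 0.90, "phone": 0.75}))
```

Note what the sketch cannot tell you: whether email completeness is what matters most to the organization. The thresholds are set up front, which is exactly the limitation the paragraph above describes.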

Contrast that with data quality management being sponsored by a non-IT part of the business. Data quality approaches in this scenario often start from the customer perspective (internal or external) and work inward. These approaches tend to surface the data quality issues that customers see, but they also tend to go inward only as far as the application layer.

Data profiling by the business is more like a data quality audit focused on errors exposed to customers, errors in the data that is often most important to the organization. Data quality improvement approaches in this scenario include many of the non-IT variables: training, job aids, process reengineering, workflow/procedure adjustments, etc. At times, controls in applications are also considered in improvement strategies, but the business solutions to these issues often rely on changes in culture and the fabric of the organization.

It seems that a full-spectrum approach to data quality improvement is one that focuses on prevention and is bi-directional—that is, it comes at data issues both from the customers’ perspective and from the database layer. These vectors intersect in the front-end systems making the connection between people, process, and technology more solid.

The first step in implementing a full-spectrum data quality improvement approach is to utilize cross-functional teams for root-cause analysis, design of prevention strategies, and their ultimate implementation. Since quality can be influenced by changes to people, process, and technology, these teams must include a full spectrum of individuals (including IT).

Organizations that make only IT responsible for data quality management run the risk of not improving the data that is most important to the organization and of going no further than technology systems for improvement. Organizations that manage data quality only outside of IT run the risk of relying too much on people and processes while not leveraging the heavy-lifting value IT brings to the table. It makes much more sense to continuously improve data from a technology perspective while also improving it from a business perspective…a full-spectrum approach to data quality improvement.

Friday, October 31, 2008

Look deeper. Data is likely the problem!

I used to think that I was overly sensitive to data issues given my profession. I can't help but trace almost all issues back to data problems. The fact that we're in the Information Age might explain why data problems seem to be at the heart of many issues. Nonetheless, look a bit deeper and you'll see that data is often the common denominator.

We had a case locally where a bus that was too tall for a footbridge overpass tried to drive under it anyway. You can guess the result (see: http://seattlepi.nwsource.com/local/359497_bus18.html). A cursory review of the circumstances shows that data problems were at the heart of this issue. For example, the GPS unit the driver was using is programmable for car, bus, or motorcycle. The driver set it for "bus". The driver's expectation was that the GPS unit would know if a route contained hazards specific to buses. It seems like a fair assumption given the choice of settings. Despite the indications of the bus height (within the bus) and the overpass height (big yellow sign on the side of the overpass), the driver tried to drive under the footbridge. The lack of GPS data about the footbridge as a hazard to buses contributed to this issue.

Luckily, the bus accident did not result in any serious injuries. There have been other data problems at the root of much more serious issues, however, especially in the medical field. Pharmacy errors are the subject of news stories all too frequently, and some of those errors end tragically. One doesn't have to do much more than search the Internet for data error stories to find daily occurrences of issues where data is a contributing cause. This isn't a case of someone being overly sensitive to data issues and seeing them because they're top-of-mind. The next time you encounter a problem, look deeper. Data is likely the problem!

Wednesday, October 29, 2008

Data Quality - A Case of Common Courtesy and Consideration

The term "data" often conjures up images of computer rooms and technicians. Although data are the building blocks for information, knowledge, wisdom, and enlightenment, they have become common elements of our everyday lives. Technology accelerated the creation and consumption of information, moving us squarely into the Information Age. Clearly, when the quality of data and information isn't adequate, our everyday lives can be adversely impacted. So, what does all this have to do with common courtesy?

What happens when someone doesn't put the lid down tightly on the garbage can? The next person to take out the garbage can be faced with a huge mess to clean up. What about when someone fails to replenish the paper towels when they run out? It may be that the next person who has to immediately address a major spill won't have the paper towels to do the job. Those scenarios are pretty common and the recipient of the garbage mess or the lack of paper towels is likely thinking, "if only so-and-so had...".

The idea is that if the people who left the lid loose or used up the last of the paper towels had been thinking about, aware of, or cared about the experience of the people who would inherit their mess, there wouldn't be a mess and the roll of paper towels would be there for the next calamity. Common courtesy suggests that people consider others who come after them and ensure those people are set up to succeed. That kind of consideration requires understanding what the needs of others are and doing things to ensure those needs are met.

Data quality is no different. True data quality improvement comes from behavior changes in people. Technology plays a part, for sure, but if people begin to understand who uses the data they create and how to create the data in a way that sets the next people up for success, the overall state of data will improve...simply because people change the way they think about data and about others who use it.

Do you know who uses the data you create and are you doing what's necessary to set those people up for success?