Pfizer strives to protect the health and safety of workers, the communities in which it operates, and the global environment. Within the company, the Global Environment Health and Safety (EHS) department is responsible for Material Hazard Safety Communication (MHSC) and for setting standards for worker safety and environmental safety. The Pharmacokinetics, Dynamics & Metabolism (PDM) department, among other responsibilities, assesses the discrete toxicity of compounds used throughout the company.
Global EHS saw an opportunity to improve the management and reliability of the data used to communicate hazard and risk assessments. Environmental and worker safety research reports could not be shared easily across the organization, for several reasons. Reports were saved in multiple formats, from Excel spreadsheets and databases to PDFs, which made it cumbersome to compare and consolidate outcome data. In addition, although a wealth of rich data was stored, it was unstructured: the naming conventions used to store it were inconsistent, which resulted in data duplication. The situation was made even more difficult because users had to search extensively through existing data that resided in multiple sources. Too often, scientists duplicated research efforts because they missed relevant data that was already available.
Based on a thorough review of current practices and the data in use, Redshift provided a simple “open and flexible” container approach to support Pfizer’s unique data ontology. This allowed for the agile development of common data categories and areas by subject matter experts, and of specialized views for the data community. Redshift’s first step in addressing the lack of common data standards was to define common naming conventions. These became the root level from which all other data would be defined. Redshift worked with Pfizer to develop and define common data categories and areas so that unstructured data could be harmonized and grouped into usable families.
As an example, to avoid duplication and errors, a single name was identified for each compound root and synonyms were linked with the appropriate name. Compound research documentation was then transferred from disparate locations into the Redshift Profiler, and assessment outcome data (metadata) was extracted from gathered research. Data was formatted following standard federation rules where available, as is the case for chemistry. In cases such as worker safety, where standard formats were not available, rules and formats were created.
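The naming step described above, one name per compound root with synonyms linked to it, can be illustrated with a minimal Python sketch. This is not the Redshift Profiler's implementation; the `CompoundRegistry` class, its methods, the normalization rule, and the example compound names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


def _key(name: str) -> str:
    """Normalize a name for lookup: trim, collapse whitespace, ignore case."""
    return " ".join(name.split()).casefold()


@dataclass
class CompoundRegistry:
    """Hypothetical registry mapping every known name to one canonical name."""

    # normalized name or synonym -> canonical compound name
    _index: dict = field(default_factory=dict)

    def register(self, canonical: str, synonyms: tuple = ()) -> None:
        """Link a canonical name and its synonyms; reject conflicting links."""
        names = (canonical, *synonyms)
        # Validate everything first so a conflict leaves the index unchanged.
        for name in names:
            existing = self._index.get(_key(name))
            if existing is not None and existing != canonical:
                raise ValueError(f"{name!r} is already linked to {existing!r}")
        for name in names:
            self._index[_key(name)] = canonical

    def resolve(self, name: str) -> str:
        """Return the canonical name for any registered name or synonym."""
        try:
            return self._index[_key(name)]
        except KeyError:
            raise KeyError(f"unknown compound name: {name!r}") from None


# Illustrative data only (assumed, not taken from the case study).
registry = CompoundRegistry()
registry.register("acetylsalicylic acid", synonyms=("aspirin", "ASA"))
```

With a structure like this, documents filed under any synonym resolve to the same record, so a report stored under "Aspirin" and one stored under "ASA" would not be treated as two different compounds. Rejecting a synonym that is already linked to another name is one possible way to surface naming conflicts early.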
Intricate logic models were programmed so that the data would be analyzed, merged, and compared accurately. The user interface and reporting options were built for usability and to give accurate insight into compound data. Finally, privacy and security requirements were implemented to ensure appropriate access to the data.