Data Manager
Data Managers in clinical research design and maintain the electronic data capture systems used in clinical trials, establish data quality standards, manage data validation programming, and oversee the process of cleaning and locking trial databases prior to statistical analysis. They work closely with biostatistics, clinical operations, and regulatory teams to ensure that study data is complete, accurate, and submission-ready.
Role at a glance
- Typical education
- Bachelor's degree in life science, CS, health informatics, or statistics
- Typical experience
- Not specified
- Key certifications
- CCDM (Certified Clinical Data Manager)
- Top employer types
- Biotechnology companies, Contract Research Organizations (CROs), pharmaceutical companies
- Growth outlook
- Sustained demand driven by escalating FDA expectations for CDISC-compliant data submissions
- AI impact (through 2030)
- Augmentation — AI-assisted query generation and automated anomaly detection reduce manual cleaning workloads, shifting the role toward overseeing AI-flagged patterns and resolving complex issues algorithms cannot handle.
Duties and responsibilities
- Design and build electronic data capture (EDC) databases in Medidata Rave, Oracle InForm, or Veeva Vault based on protocol specifications
- Develop data management plans (DMPs) defining data collection standards, validation rules, and quality control procedures
- Create edit checks and validation programs to detect out-of-range values, inconsistencies, and missing data during trial conduct
- Lead the user acceptance testing (UAT) process for EDC databases, coordinating sign-off from clinical, medical, and biostatistics teams
- Manage ongoing data cleaning: generate query listings, route queries to sites, and track resolution through database lock
- Execute data reconciliation between EDC, external lab data, eCOA, and safety databases
- Prepare and execute the database lock process: final query resolution, missing data reconciliation, data audit trail review
- Coordinate with biostatistics on CDASH/SDTM dataset preparation and ADaM derivations for regulatory submissions
- Develop coding conventions and manage medical coding in MedDRA and WHO Drug dictionaries
- Support regulatory inspections: prepare data management documentation, respond to health authority data questions, and manage audit trail requests
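To make the edit check duty above concrete, here is a minimal sketch of a range-and-missing-value check in plain Python. In practice these checks are configured inside the EDC platform rather than written as standalone scripts; the field names (`SUBJID`, `SYSBP`), the `Query` structure, and the 60-250 mmHg limits are all illustrative assumptions, not taken from any protocol or vendor API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    """A data query to be routed to a site (hypothetical structure)."""
    subject_id: str
    field: str
    message: str


def check_systolic_bp(record: dict) -> Optional[Query]:
    """Hypothetical edit check: flag a missing or out-of-range systolic BP."""
    value = record.get("SYSBP")
    if value is None:
        return Query(record["SUBJID"], "SYSBP",
                     "Value is missing; please enter a value or mark as not done.")
    # Illustrative plausibility limits only (assumed, not from a protocol)
    if not 60 <= value <= 250:
        return Query(record["SUBJID"], "SYSBP",
                     f"Value {value} is outside the expected range 60-250 mmHg; please verify.")
    return None


# Made-up example records for three subjects
records = [
    {"SUBJID": "001", "SYSBP": 122},
    {"SUBJID": "002", "SYSBP": 310},
    {"SUBJID": "003", "SYSBP": None},
]

# Run the check on every record and keep only the ones that raised a query
queries = [q for q in (check_systolic_bp(r) for r in records) if q is not None]
for q in queries:
    print(q.subject_id, q.field, q.message)
```

A check that is too tight fires on valid data and generates unnecessary queries, which is why limits like these are normally agreed with the medical team and tested in UAT before go-live.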
Overview
Data Managers are responsible for the integrity of clinical trial data from first entry through regulatory submission. They build the EDC systems that sites use to enter data, write the validation rules that catch errors as they happen, manage the cleaning process that resolves discrepancies, and execute the database lock that hands clean data to biostatistics.
EDC database development is the foundational technical work. Building a clinical database for a complex Phase III trial involves translating a 150-page protocol into structured forms, field definitions, controlled terminology, and edit checks. Design decisions made during database build — how a date field is formatted, whether a required field has a missing data code option, how adverse event severity is coded — affect data quality for the entire two-to-four-year life of the study.
Data cleaning is the ongoing daily work during trial conduct. Edit checks flag anomalous values automatically; queries go to sites for resolution; responses come back, and each query either closes or goes back to the site for further discussion. In a 50-site global study, the query workload can run into hundreds of open items at any point. Data managers who keep query aging under control, with most queries resolved within 30 days of opening, arrive at database lock with weeks of work remaining rather than months.
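Query aging can be tracked with very little code. The sketch below assumes a hypothetical export of open queries with an `id`, a `site`, and an `opened` date, and uses the 30-day target mentioned above as the overdue threshold. Real EDC systems provide their own aging reports, so treat this as an illustration of the calculation, not of any vendor's report format.

```python
from datetime import date

AGING_THRESHOLD_DAYS = 30  # resolution target described in the text


def query_age_days(opened: date, as_of: date) -> int:
    """Number of days a query has been open as of the reporting date."""
    return (as_of - opened).days


def aging_summary(open_queries: list, as_of: date) -> dict:
    """Summarize open queries: total count, overdue IDs, and oldest age."""
    ages = [query_age_days(q["opened"], as_of) for q in open_queries]
    overdue = [q["id"] for q, age in zip(open_queries, ages)
               if age > AGING_THRESHOLD_DAYS]
    return {
        "open": len(open_queries),
        "overdue": overdue,
        "oldest_days": max(ages, default=0),
    }


# Made-up open-query listing (IDs, sites, and dates are hypothetical)
open_queries = [
    {"id": "Q-101", "site": "S01", "opened": date(2024, 1, 5)},
    {"id": "Q-102", "site": "S02", "opened": date(2024, 2, 20)},
    {"id": "Q-103", "site": "S01", "opened": date(2024, 3, 1)},
]

summary = aging_summary(open_queries, as_of=date(2024, 3, 10))
print(summary)
```

Grouping the overdue list by site is a natural next step, since aging problems tend to cluster at a small number of sites.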
Database lock preparation brings together data management, biostatistics, clinical operations, and medical monitoring to confirm that every significant data issue has been resolved, all external data sources have been reconciled, and the audit trail is complete. On complex programs, this process involves formal sign-off meetings and may span several weeks. The locked database then goes to biostatistics for the primary analysis that drives the regulatory submission.
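Reconciling external data sources before lock comes down to comparing records on a shared key. The following is a minimal sketch, assuming EDC and central lab records are both keyed by subject and visit and each carries a sample collection date. The field names (`SUBJID`, `VISIT`, `collection_date`) and the data are hypothetical, and a real reconciliation would usually compare more fields than this.

```python
def reconcile(edc_rows: list, lab_rows: list) -> dict:
    """Compare EDC and external lab records keyed by (subject, visit)."""
    edc = {(r["SUBJID"], r["VISIT"]): r["collection_date"] for r in edc_rows}
    lab = {(r["SUBJID"], r["VISIT"]): r["collection_date"] for r in lab_rows}
    return {
        # Recorded in EDC but no matching lab record received
        "missing_in_lab": sorted(edc.keys() - lab.keys()),
        # Lab record received but nothing entered in EDC
        "missing_in_edc": sorted(lab.keys() - edc.keys()),
        # Present in both, but the collection dates disagree
        "date_mismatch": sorted(k for k in edc.keys() & lab.keys()
                                if edc[k] != lab[k]),
    }


# Made-up records for illustration
edc_rows = [
    {"SUBJID": "001", "VISIT": "V1", "collection_date": "2024-01-10"},
    {"SUBJID": "001", "VISIT": "V2", "collection_date": "2024-02-10"},
    {"SUBJID": "002", "VISIT": "V1", "collection_date": "2024-01-12"},
]
lab_rows = [
    {"SUBJID": "001", "VISIT": "V1", "collection_date": "2024-01-10"},
    {"SUBJID": "001", "VISIT": "V2", "collection_date": "2024-02-11"},
    {"SUBJID": "003", "VISIT": "V1", "collection_date": "2024-01-15"},
]

discrepancies = reconcile(edc_rows, lab_rows)
for category, keys in discrepancies.items():
    print(category, keys)
```

Each discrepancy category typically leads to a different action: a site query for missing EDC entries, a vendor follow-up for missing lab records, and a review of source documents for date mismatches.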
Qualifications
Education:
- Bachelor's degree in a life science, computer science, health informatics, or statistics
- Master's degree in clinical research, health informatics, or biostatistics is valued for senior DM roles
- CCDM (Certified Clinical Data Manager) through SCDM is the primary professional certification for this field
Technical skills:
- EDC platforms: Medidata Rave (most widely used; Rave Designer and Rave Architect skills are distinct), Oracle InForm, Veeva Vault EDC
- CDISC standards: CDASH for data collection design, SDTM for submission datasets, ADaM for analysis datasets
- Medical coding: MedDRA hierarchy and coding conventions, WHO Drug dictionary, coding review processes
- SAS: DATA step and PROC SQL for data cleaning and SDTM conversion (expected for senior and lead roles)
- Python: pandas and data validation scripting increasingly requested as an alternative or complement to SAS
Regulatory knowledge:
- ICH E6(R3): data management requirements and essential documents for GCP compliance
- FDA 21 CFR Part 11: electronic records and electronic signatures requirements for EDC validation
- FDA Technical Conformance Guide for CDISC data standards in NDA/BLA submissions
- Data Management Plan development and sponsor SOP compliance
Soft skills:
- Methodical attention to detail in validation logic — an edit check that fires incorrectly generates unnecessary queries and erodes site trust
- Cross-functional communication: data managers field questions from clinical, medical, biostatistics, and regulatory simultaneously
- Project management: managing query aging, database lock timelines, and UAT schedules across multiple studies
Career outlook
Clinical Data Managers occupy a specialized but essential niche in the clinical research workforce. Every clinical trial requires data management infrastructure, and the complexity of that infrastructure has grown substantially as regulatory data standards (CDISC), electronic patient-reported outcomes (eCOA), and decentralized trial data streams have multiplied the sources of data that must be integrated into a clean, submission-ready database.
The FDA's escalating expectations for CDISC-compliant data submissions in all NDA and BLA applications have created sustained demand for data managers who understand SDTM and ADaM standards. Companies that have historically run trials with non-standardized data structures are being required to convert and resubmit, creating remediation work alongside new trial design work.
AI is entering the data quality space in meaningful ways. Automated anomaly detection and AI-assisted query generation are reducing the manual workload for routine data cleaning. These tools are changing the role rather than eliminating it — experienced data managers shift from executing routine query cycles to overseeing AI-flagged patterns and making judgment calls about complex data issues that algorithms cannot resolve.
Decentralized trial data integration represents both a challenge and a growth area. When participants wear biosensors, complete eCOA instruments on tablets, and have blood drawn by at-home nurses who upload results to external systems, integrating all of those data streams into the master EDC requires the kind of data architecture and reconciliation work that data managers are uniquely positioned to do.
For career advancement, Lead Data Manager and Data Management Lead roles carry team management responsibility and pay $100K–$125K. Data Management Directors and Heads of Data Management at mid-size biotechs or CROs earn $130K–$175K. Data managers with SDTM programming experience can also move into biostatistics programming roles, which come with a significant salary uplift.
Sample cover letter
Dear Hiring Manager,
I'm applying for the Data Manager position at [Company/CRO]. I've been a clinical data manager at [Company] for four years, managing EDC design, data cleaning, and database lock on three Phase II oncology studies — two using Medidata Rave and one using Oracle InForm.
On my most recent study I served as the primary DM from protocol finalization through database lock. I built the Rave database from the annotated CRF, developed the DMP, programmed 140 edit checks, led UAT with the clinical and biostatistics teams, and managed the query workflow for 32 sites through a 24-month enrollment period. We hit database lock 10 days ahead of the milestone date committed to the regulatory team, with fewer than 20 outstanding minor queries at lock versus the 80+ that had been typical on prior programs.
The improvement came from tightening the UAT process. On previous studies, edit checks that fired incorrectly weren't caught until they generated site queries. I added a manual spot-check layer in UAT specifically for edit check false-positive rates, which identified 12 checks that needed logic adjustment before we went live. The result was a meaningfully lower query burden on sites and faster resolution.
I hold the CCDM credential and I've been working through the CDISC SDTM training modules to build submission standards knowledge for late-phase work. Your Phase IIb-to-III transition program looks like the right context to apply that training.
Thank you for your consideration.
[Your Name]
Frequently asked questions
- What is a database lock in clinical trials?
- Database lock is the formal process by which the EDC system is made read-only after data cleaning is complete and all outstanding queries are resolved. Once the database is locked, no further modifications can be made without a documented re-opening procedure. The locked database is the dataset transferred to biostatistics for the primary statistical analysis. Database lock is a major trial milestone, and the quality of data management throughout the trial determines how cleanly and quickly it is reached.
- What is a Data Management Plan?
- A Data Management Plan (DMP) is a document created at study startup that defines how data will be collected, validated, cleaned, and transferred throughout the trial. It specifies the EDC system, edit check logic, coding conventions, query workflows, external data handling, and database lock procedures. The DMP is maintained as part of the trial's essential documentation under ICH E6; the sponsor typically reviews it, and health authorities may review it during inspections.
- What technical skills does a clinical Data Manager need?
- EDC configuration in at least one major platform (Medidata Rave, Oracle InForm, or Veeva Vault EDC) is the core technical requirement. SAS or Python programming for data cleaning and SDTM conversion is increasingly expected at senior levels. CDASH and SDTM data standards knowledge is essential for late-phase programs with NDA/BLA submission requirements. MedDRA and WHO Drug coding experience is standard for any DM role in pharma.
- How is the Data Manager role affected by AI tools?
- AI-driven anomaly detection is being applied to clinical data quality monitoring, flagging unusual patterns that might indicate data entry errors or site-level issues faster than traditional edit checks. Natural language processing tools are also being used for adverse event narrative coding. These tools reduce manual query generation for routine issues and shift the data manager's focus toward investigating complex data patterns that algorithms surface but cannot resolve on their own.
- What is CDASH and why does it matter?
- CDASH (Clinical Data Acquisition Standards Harmonization) is the CDISC standard that defines how clinical trial data should be collected at the site level — the field names, formats, and structures that make data readily convertible to SDTM for regulatory submission. FDA expects NDA and BLA submissions to include SDTM-compliant datasets, and building to CDASH from the start makes SDTM conversion more efficient. Data managers who understand CDASH design trials that are easier to submit.