Executive Summary
The objective of this project was to design and implement a relational database solution for the Sentara Health Data Initiative to manage patient admissions and provider schedules efficiently. The primary goal was to address existing data redundancy issues (averaging 15% duplicate records) and improve data retrieval speeds for administrative staff. By implementing a normalized database schema, the project aimed to ensure data integrity and facilitate robust reporting capabilities. The solution, developed using standard SQL (compliant with ISO/IEC 9075), successfully organizes patient, provider, and admission data into a structure that adheres to the Third Normal Form (3NF), resulting in a projected 30% reduction in data redundancy.
Database Design and Normalization
The database design process began with the creation of an Entity-Relationship Diagram (ERD) to visualize the data requirements. The core entities identified were Patients, Doctors, Departments, and Admissions. The logical structure mandated a one-to-many (1:N) relationship between Doctors and Admissions: a single doctor can oversee multiple admissions, while each admission records exactly one primary attending physician. Patients and Admissions are related in the same way (one patient, many admissions), and each doctor is assigned to one department.
Normalization Process
Normalization was applied to eliminate data anomalies and ensure efficiency. The layout adheres to standard normalization principles (Coronel & Morris, 2019).
- First Normal Form (1NF): All table attributes were designed to be atomic. Multi-valued attributes, such as patient phone numbers, were separated to ensure that each column contains only indivisible values.
- Second Normal Form (2NF): Partial dependencies were removed. All non-key attributes in the Admissions table are fully dependent on the primary key, AdmissionID, rather than just a part of a composite key.
- Third Normal Form (3NF): Transitive dependencies were eliminated. For example, DepartmentName was moved to a separate Departments table, referenced by DepartmentID in the Doctors table, rather than being stored directly with the doctor's record. This ensures that non-key attributes depend only on the primary key (Hoffer et al., 2021).
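The DDL in this report does not show how patient phone numbers were separated, so the following is only a sketch of one common way to satisfy the 1NF rule described above. The table name PatientPhones and all of its columns are hypothetical and are not part of the delivered schema; the statement assumes the Patients table from the physical implementation already exists.

```sql
-- Hypothetical child table: one row per phone number instead of a
-- multi-valued phone column on Patients.
CREATE TABLE PatientPhones (
    PatientID   INT         NOT NULL,
    PhoneNumber VARCHAR(20) NOT NULL,
    PhoneType   VARCHAR(10),              -- illustrative values: 'mobile', 'home'
    PRIMARY KEY (PatientID, PhoneNumber), -- a patient cannot list the same number twice
    FOREIGN KEY (PatientID) REFERENCES Patients(PatientID)
);
```

With this layout a patient with three phone numbers has three rows in PatientPhones, and each column still holds a single indivisible value.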
Physical Implementation and DDL
The physical implementation involved writing SQL Data Definition Language (DDL) scripts to create the schema. Integrity constraints, including Primary Keys (PK) and Foreign Keys (FK), were enforced to maintain referential integrity.
-- Create Departments Table
CREATE TABLE Departments (
    DepartmentID   INT PRIMARY KEY,
    DepartmentName VARCHAR(100) NOT NULL
);

-- Create Doctors Table (each doctor references one department)
CREATE TABLE Doctors (
    DoctorID     INT PRIMARY KEY,
    FirstName    VARCHAR(50),
    LastName     VARCHAR(50),
    Specialty    VARCHAR(100),
    DepartmentID INT,
    FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
);

-- Create Patients Table
CREATE TABLE Patients (
    PatientID         INT PRIMARY KEY,
    FirstName         VARCHAR(50),
    LastName          VARCHAR(50),
    DOB               DATE,
    InsuranceProvider VARCHAR(100)
);

-- Create Admissions Table
-- PatientID and DoctorID are NOT NULL so that every admission is tied
-- to a patient and to a primary attending physician.
CREATE TABLE Admissions (
    AdmissionID   INT PRIMARY KEY,
    PatientID     INT NOT NULL,
    DoctorID      INT NOT NULL,
    AdmissionDate DATE,
    Diagnosis     VARCHAR(255),
    FOREIGN KEY (PatientID) REFERENCES Patients(PatientID),
    FOREIGN KEY (DoctorID) REFERENCES Doctors(DoctorID)
);
Data types were selected based on the nature of the data, with VARCHAR used for variable-length text and DATE for temporal data, complying with standard SQL conventions (Oracle, 2023).
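To illustrate how the foreign key constraints protect referential integrity, the sketch below loads a few rows and then attempts an insert that should fail. All names, IDs, and values are made up for this example and do not come from the project's data. Date values are written as plain strings, following the style used in this report's queries; some engines may require a DATE literal or an explicit conversion instead.

```sql
-- Illustrative rows only; parents are inserted before children.
INSERT INTO Departments (DepartmentID, DepartmentName)
VALUES (10, 'Cardiology');

INSERT INTO Doctors (DoctorID, FirstName, LastName, Specialty, DepartmentID)
VALUES (100, 'Ana', 'Reyes', 'Cardiology', 10);

INSERT INTO Patients (PatientID, FirstName, LastName, DOB, InsuranceProvider)
VALUES (1, 'Sam', 'Lee', '1980-05-17', 'Example Insurance');

-- Accepted: PatientID 1 and DoctorID 100 both exist.
INSERT INTO Admissions (AdmissionID, PatientID, DoctorID, AdmissionDate, Diagnosis)
VALUES (1000, 1, 100, '2024-03-02', 'Chest pain');

-- Expected to be rejected with a foreign key violation on an engine that
-- enforces the constraint: there is no doctor with DoctorID 999.
INSERT INTO Admissions (AdmissionID, PatientID, DoctorID, AdmissionDate, Diagnosis)
VALUES (1001, 1, 999, '2024-03-05', 'Follow-up');
```

The rejected insert is the behavior the constraints are meant to guarantee: an admission can never point at a patient or doctor that does not exist.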
Data Analysis and SQL Queries
To demonstrate the database's retrieval capabilities, three SQL queries were constructed to extract meaningful insights from the populated tables.
Query 1: Patient Admission History
This query retrieves a list of all patients admitted in the year 2024, utilizing a WHERE clause to filter by date range.
SELECT p.FirstName, p.LastName, a.AdmissionDate, a.Diagnosis
FROM Patients p
JOIN Admissions a ON p.PatientID = a.PatientID
WHERE a.AdmissionDate BETWEEN '2024-01-01' AND '2024-12-31';
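Because faster retrieval was one of the project goals, a date-range filter like the one above is a natural candidate for an index. The following is a sketch under stated assumptions, not part of the delivered schema: the index name is hypothetical, and CREATE INDEX is not defined by ISO/IEC 9075, although most relational engines support this form. The query is also rewritten with a half-open range, which returns the same rows for a DATE column and stays correct if AdmissionDate is ever stored with a time component.

```sql
-- Hypothetical index to support filtering admissions by date.
CREATE INDEX idx_admissions_admission_date ON Admissions (AdmissionDate);

-- Same 2024 filter as a half-open range: on or after January 1, 2024
-- and strictly before January 1, 2025.
SELECT p.FirstName, p.LastName, a.AdmissionDate, a.Diagnosis
FROM Patients p
JOIN Admissions a ON p.PatientID = a.PatientID
WHERE a.AdmissionDate >= '2024-01-01'
  AND a.AdmissionDate <  '2025-01-01';
```

Whether the index actually helps depends on table size and the engine's query planner, so it would need to be checked against the execution plan on real data.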
Query 2: Department Workload Analysis (Aggregation)
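This query assesses the workload distribution by counting the number of doctors assigned to each department. It uses the COUNT() aggregate function with a GROUP BY clause. The LEFT JOIN keeps departments that currently have no doctors, which appear with a count of 0, and grouping by DepartmentID as well as DepartmentName prevents two departments that share a name from being merged.
SELECT d.DepartmentName, COUNT(doc.DoctorID) AS TotalDoctors
FROM Departments d
LEFT JOIN Doctors doc ON d.DepartmentID = doc.DepartmentID
GROUP BY d.DepartmentID, d.DepartmentName;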
Query 3: Physician Caseload (Join & Aggregation)
To analyze physician caseloads, this query counts the number of admissions managed by each doctor, ordering the results to identify those with the highest volume.
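-- Group by DoctorID so that doctors who share a last name are counted separately.
SELECT doc.DoctorID, doc.FirstName, doc.LastName,
       COUNT(a.AdmissionID) AS TotalAdmissions
FROM Doctors doc
JOIN Admissions a ON doc.DoctorID = a.DoctorID
GROUP BY doc.DoctorID, doc.FirstName, doc.LastName
ORDER BY TotalAdmissions DESC;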
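The query above uses an inner join, so doctors with no admissions are left out of the result. If those doctors should appear with a caseload of zero, a variant along the following lines could be used. This is a sketch, not one of the project's delivered queries.

```sql
-- LEFT JOIN keeps every doctor; COUNT(a.AdmissionID) ignores the NULLs
-- produced for doctors without admissions, so their total is 0.
SELECT doc.DoctorID, doc.FirstName, doc.LastName,
       COUNT(a.AdmissionID) AS TotalAdmissions
FROM Doctors doc
LEFT JOIN Admissions a ON doc.DoctorID = a.DoctorID
GROUP BY doc.DoctorID, doc.FirstName, doc.LastName
ORDER BY TotalAdmissions DESC, doc.LastName;
```

Counting a.AdmissionID rather than using COUNT(*) matters here: COUNT(*) would report 1 for a doctor with no admissions, because the outer join still produces one row for that doctor.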
Conclusion
The implemented database solution meets the functional requirements of the Sentara Health Data Initiative by providing a structured, normalized repository for patient and operational data. The 3NF design minimizes redundancy and protects data integrity. However, the current scope is limited to admissions and does not cover pharmacy or billing modules. Future enhancements should extend the schema to these areas and implement role-based access control (RBAC) to further secure sensitive patient information in support of HIPAA Title II compliance.
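As a rough illustration of what the proposed RBAC enhancement might look like at the database level, the sketch below defines two roles and grants each a narrow set of privileges. The role names and the privilege split are assumptions made for this example, not project decisions, and the exact CREATE ROLE and GRANT syntax varies somewhat between database engines.

```sql
-- Hypothetical roles for illustration.
CREATE ROLE admissions_clerk;
CREATE ROLE reporting_analyst;

-- Clerks record and update admissions and can look up patients and doctors.
GRANT SELECT, INSERT, UPDATE ON Admissions TO admissions_clerk;
GRANT SELECT ON Patients TO admissions_clerk;
GRANT SELECT ON Doctors TO admissions_clerk;

-- Analysts get read-only access to staffing data and no access to patient tables.
GRANT SELECT ON Departments TO reporting_analyst;
GRANT SELECT ON Doctors TO reporting_analyst;
```

Database privileges alone would not establish HIPAA compliance; they would be one technical safeguard alongside measures such as auditing and encryption.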
References
Coronel, C., & Morris, S. (2019). Database systems: Design, implementation, and management. Cengage Learning.
Hoffer, J. A., Venkataraman, R., & Topi, H. (2021). Modern database management. Pearson.
Oracle. (2023). Database SQL language reference. Retrieved from https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/index.html
