Regulated Data Services Overview & Resources
Regulated Data Services
University Research Computing and Data (Univ RCD) Services offers comprehensive consulting, migration, and strategic planning support to research groups working with regulated data. Our team collaborates closely with research analysts, administrators, data providers, and compliance professionals to streamline research that is frequently hindered by lengthy setup times and complex legal and technical controls. Our core contribution to this mission is operating the Regulated Data (ReD) Environment, a Platform-as-a-Service designed to address these challenges. This page provides an overview of the platform and outlines the process for gaining access.
What is the ReD Environment?
The ReD Environment is a computing, storage, and database environment for researchers working with highly regulated data: data that federal, state, or local laws or other contractual obligations require to be maintained at a level of privacy and security meeting federally recognized standards such as NIST 800-53 and NIST 800-171. The ReD Environment is hosted entirely in AWS and consists of two basic computing platforms for researchers:
(1) AWS Research Engineering Studio (RES), which provides a virtual desktop environment and application hosting. This is great for those researchers accustomed to a Windows Remote Desktop and associated applications. Within RES, the ReD Team can create custom workspaces that launch specific instances (cores, memory, GPUs) with specific OS images and software.
(2) AWS ParallelCluster + Open OnDemand, which provides a scale-out high-performance computing environment with a web front end through Open OnDemand and the SLURM batch scheduler. Each Principal Investigator (PI) and/or project will have its own dedicated partition(s) that can be customized to the technology needs (instance type, cores, memory, GPUs) and capacity limits of that PI/project. Data storage is presented to the ParallelCluster as a POSIX filesystem with access controls at the project level and is backed by S3.
Standard software offerings include Python, SAS, Stata, R, Bioconductor, Jupyter Notebook, LibreOffice, Globus, Julia, and, soon, MATLAB. Software offerings can be customized to meet research needs; this is typically discussed as part of project onboarding.
How to Request Access
The ReD Environment is for researchers working with highly regulated data, meaning data subject to federal, state, and local laws or other contractual obligations that mandate adherence to specific privacy and security standards. These standards require Univ RCD Services to gather detailed information during the account creation and onboarding process, and they place specific obligations on principal investigators and data managers (e.g., immediately notifying Univ RCD when a researcher leaves a project). This section describes the information required from new research groups and communicates key responsibilities. Please read this section in its entirety.
Eligibility – New Labs or Groups
The following individuals are authorized to submit new access requests on behalf of research groups seeking to use the ReD Environment:
- Principal Investigators (PI)
- Data Managers (DM)
Please send in the ServiceNow request including the information below after carefully reading the content of this page. Both the PI and DM will also need to provide certificates of completion for the coursework listed below as part of the account provisioning process (and annually thereafter).
What is a Data Manager?
A DM is an individual explicitly designated in writing by the Principal Investigator (in a ServiceNow ticket or this form) to act on their behalf in managing data access and compliance responsibilities within the lab. The DM is responsible for approving data exports, requesting new user accounts, and immediately notifying platform administrators when a lab member leaves the respective project so that the account can be deactivated promptly. The DM is also responsible for ensuring that all lab members complete required training and submit their training certificates. The Office of the Vice Provost for Research outlines the broader responsibilities for this role in its Research Data Ownership Policy.
Eligibility – New Accounts for Existing Labs or Groups
Individuals who both (1) are named on Data Use Agreements (DUAs) for research projects already active within an established lab in the ReD Environment and (2) have an active Harvard Sponsored Role may request personal accounts. Each request must include explicit written approval from the Principal Investigator (PI) or Data Manager (DM). For simplicity, the PI or DM may submit account requests on behalf of individuals, as such requests automatically fulfill the written approval requirement. Individuals requesting new accounts for existing projects may apply by writing to regulated_data_environment@harvard.edu with the following information for each team member who needs account access:
- First name
- Last name
- HUID
- Country of Citizenship
- Harvard Email Address
- Other Institutional Email Address
- Role Type (e.g., PI, DM, research analyst, doctoral candidate)
- DUA numbers for which the individual has access authorization
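Before sending a request, it can help to confirm that every item in the list above is filled in for each team member. The following is a minimal sketch, not an official ReD tool: the class name, field names, and the `optional` parameter are all hypothetical, and this page does not say whether any item (for example, the other institutional email address) may be left blank.

```python
from dataclasses import dataclass, fields
from typing import FrozenSet, List


@dataclass
class AccountRequest:
    """One team member's details; field names are illustrative only."""
    first_name: str
    last_name: str
    huid: str
    country_of_citizenship: str
    harvard_email: str
    other_institutional_email: str
    role_type: str
    dua_numbers: List[str]


def missing_fields(
    request: AccountRequest,
    optional: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return names of empty fields, skipping any listed in `optional`."""
    missing = []
    for f in fields(request):
        if f.name in optional:
            continue
        value = getattr(request, f.name)
        # Strings count as empty when blank; lists when they have no items.
        empty = not value.strip() if isinstance(value, str) else len(value) == 0
        if empty:
            missing.append(f.name)
    return missing
```

Calling `missing_fields(request)` returns an empty list when nothing is missing, so a non-empty result is a prompt to complete the request before emailing it.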
Compliance Training Certificates
For each research team member above, certificates of completion for each of the following courses should also be attached to this form as a prerequisite for obtaining access:
In the Harvard Training Portal, there are six modules organized under “Regulated Data Environment Trainings (ReD Users)”, Playlist ID: 0000094291:
- Harvard Research Data Security Training Course (OVPR)
- Principles of Research Data Confidentiality (HR)
- Information Security Awareness Training (HUIT)
- Insider Threats Foundations (University-Wide)
- PII, Security and You (University-Wide)
- Regulated Data Environment – Rules of Behavior for Internal Non-Privileged Users (University-Wide)
From the CITI Program, you will need to complete one course consisting of several modules:
- Human Subjects Training: Social Behavioral Research Investigators (sign in with Harvard SSO)
ServiceNow Ticket: Information to Include
- PI & DM
- Project Name
- Brief Description of the project or purpose of the account
- Relevant DUA/DAT/IRB numbers from dua.harvard.edu
- Service Requested: Research Engineering Studio (GUI) or Open OnDemand + ParallelCluster (CLI)
- List of additional users/collaborators needing access to the account if any
- CC the PI or DM if they choose not to file the request themselves
- Send to regulated_data_environment@harvard.edu referencing your project name in the title
Next Steps
For new projects, someone from the ReD Team will reach out to schedule a discovery call or communicate asynchronously to confirm the details, as each setup is unique. For new accounts under existing projects, we can activate your account asynchronously, and you will receive an email from URCDS/Okta with instructions to set up your MFA.
How do I get help?
- Submit a ticket to regulated_data_environment@harvard.edu and the ReD Team will get back to you shortly.
- Attend Office Hours. Tuesdays 2-4pm.
- If you have an urgent regulatory matter in need of immediate attention, please work through your research administration or OVPR.
Data Import
- Due to compliance requirements, data cannot be transferred into or out of the ReD Environment without proper authorization.
- Researchers use Globus High-Assurance transfer to move data into a “Sanitization Zone”, where the data is scanned for malware and viruses.
- Once the data is deemed sanitized, it is copied to your import location in the production service.
Details about this process are provided here after you have an account.
Data Export
- Due to compliance requirements, data cannot be transferred into or out of the ReD Environment without proper authorization.
- Researchers copy data in the production environment to their export location and notify the ReD Team: regulated_data_environment@harvard.edu
- The ReD Team scans this data for PII/PHI and, upon a successful scan, seeks approval from the project’s Data Manager.
Full details about this process are provided here after you have an account.
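The export steps above happen in a fixed order, which can be sketched as a small tracker that refuses to skip ahead. This is an illustration only: the class and step names are hypothetical, and it models just the steps described on this page. What follows Data Manager approval, and what happens after an unsuccessful scan, are not described here and are left out.

```python
from enum import Enum
from typing import List


class ExportStep(Enum):
    """Export steps in the order this page describes them."""
    COPY_TO_EXPORT_LOCATION = "researcher copies data to the export location"
    NOTIFY_RED_TEAM = "researcher notifies the ReD Team"
    PII_PHI_SCAN = "ReD Team scans the data for PII/PHI"
    DM_APPROVAL = "ReD Team seeks approval from the Data Manager"


class ExportTracker:
    """Records completed steps and rejects any step taken out of order."""

    def __init__(self) -> None:
        self.completed: List[ExportStep] = []

    def complete(self, step: ExportStep) -> None:
        order = list(ExportStep)  # Enum iteration follows definition order
        if len(self.completed) == len(order):
            raise ValueError("all steps are already complete")
        expected = order[len(self.completed)]
        if step is not expected:
            raise ValueError(f"expected {expected.name}, got {step.name}")
        self.completed.append(step)

    @property
    def remaining(self) -> List[ExportStep]:
        """Steps that have not been completed yet, in order."""
        return list(ExportStep)[len(self.completed):]
```

For example, calling `complete(ExportStep.PII_PHI_SCAN)` before the ReD Team has been notified raises a `ValueError`, mirroring the requirement that the scan follows the notification.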
Production Storage
- A POSIX-based filesystem is provided as an interface to both compute services.
- This offering protects data in two ways:
- Snapshots to protect against accidental file deletions
- Multi-site replication to protect against natural disasters.
Software Requests
- Due to compliance requirements, software cannot be transferred into or out of the ReD Environment without proper authorization. Posit Package Manager may be used to pull R, Python, and Bioconductor packages, which must first pass a security screening.
- We advise users to build a container (Singularity or Docker) and transfer it into the Sanitization Zone following the Data Import steps above.
- Additional requests can be made by emailing regulated_data_environment@harvard.edu
Code Repositories
- Due to the compliance requirements, end-user code cannot be transferred into or out of the ReD Environment without proper authorization and screening.
- We leverage a static code analysis tool to evaluate your code and ensure it is without vulnerabilities prior to being transferred into the production environment.
- Prior authorization is needed to transfer code out of the environment, similar to the data export step.
- Detailed information about this process can be found here after you have an account.
Request Lab/Project Setup in ParallelCluster
- Each Lab/project will have its own dedicated SLURM partition(s) with restricted use for only this project. Only users in the associated project access control group will have access to these computing resources.
- Send a request ticket to regulated_data_environment@harvard.edu with the following information:
- List of users/collaborators to have access to this project
- Please request the type of queues you would like to have from the list here (requires authentication). This allows you to customize the computing needs (processor type, cores, memory, network bandwidth).
- Once a request is received by the ReD team, a ParallelCluster SLURM partition for your project will be configured and you will receive a confirmation email from the ReD team.
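Once a project partition exists, jobs are directed to it through the batch script's partition setting. The sketch below renders a basic SLURM batch script as text using standard `sbatch` directives. The partition name, job name, resource values, and command shown are placeholders, not actual ReD settings; the real partition name comes from the ReD Team's confirmation email.

```python
def render_sbatch_script(
    partition: str,
    job_name: str,
    command: str,
    cpus: int = 4,
    mem_gb: int = 16,
    time_limit: str = "01:00:00",
) -> str:
    """Build the text of a SLURM batch script targeting one partition."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",  # the project's dedicated partition
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",      # wall-clock limit, HH:MM:SS
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "mylab-cpu" is a hypothetical partition name used for illustration.
    print(render_sbatch_script("mylab-cpu", "demo", "python analysis.py"))
```

The rendered text can be saved to a file and submitted with `sbatch`. Users outside the project's access control group would not be able to run jobs on that partition.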
Request Lab/Project Setup in Research Engineering Studio
- Each Lab/project will have its own dedicated Account and Workspace, restricted to use by that project. Only users in the associated project access control group will have access to this Workspace.
- Send a request email to regulated_data_environment@harvard.edu with the following information:
- Name of Project and PI
- List of users/collaborators to have access to this Workspace
- Designate the Workspace Type you would like to have:
- Windows
- Linux
Further details of this service can be found here after you have an account.
Logging In
- We provide single sign-on for all of our services.
- Signing in requires an up-to-date web browser and a fully patched system.
Further details of this service can be found here after you have an account.
Offboarding a User
- If a user will be leaving a project or the institution, it is important to notify the ReD Team at least 2 business days before their last day of access: regulated_data_environment@harvard.edu
- Even if the user will transfer to another institution and continue working on the project, please update the ReD Team. In many cases, an amendment to the DUA or IRB must be completed.
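The notification window above can be worked out with simple date arithmetic. This sketch assumes that "business days" means Monday through Friday and ignores holidays, which this page does not define; the function name is hypothetical.

```python
from datetime import date, timedelta


def latest_notice_date(last_day: date, business_days: int = 2) -> date:
    """Return the latest date that is `business_days` weekdays before `last_day`."""
    current = last_day
    remaining = business_days
    while remaining > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


# A last day of access on Monday 2025-03-10 means notifying the
# ReD Team no later than Thursday 2025-03-06 under these assumptions.
print(latest_notice_date(date(2025, 3, 10)))
```

Notifying earlier than the computed date is always safe, since the requirement is a minimum lead time.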