diff --git a/docs/file-structure.md b/docs/file-structure.md new file mode 100644 index 00000000..57af3303 --- /dev/null +++ b/docs/file-structure.md @@ -0,0 +1,18 @@ +# File Structure + +All paths mentioned on this page are relative to RAPIDS' root folder. + +If you want to extract the behavioral features that RAPIDS offers, you will only have to create or modify the [`.env` file](https://www.rapids.science/setup/configuration/#database-credentials), [participants files](https://www.rapids.science/setup/configuration/#participant-files), [day segment files](https://www.rapids.science/setup/configuration/#day-segments), and the `config.yaml` file. The `config.yaml` file is the heart of RAPIDS and includes parameters to manage participants, data sources, sensor data, visualizations and more. + + +All data is saved in `data/`. The `data/external/` folder stores any data imported or created by the user, `data/raw/` stores sensor data as imported from your database, `data/interim/` has intermediate files necessary to compute behavioral features from raw data, and `data/processed/` has all the final files with the behavioral features in folders per participant and sensor. + +All the source code is saved in `src/`. The `src/data/` folder stores scripts to download, clean and pre-process sensor data, `src/features/` has scripts to extract behavioral features organized in their respective subfolders, `src/models/` can host any script to create models or statistical analyses with the behavioral features you extract, and `src/visualization/` has scripts to create plots of the raw and processed data. + +There are other important files and folders, but they are only relevant if you are interested in extending RAPIDS (e.g. virtual env files, docs, tests, Dockerfile, the Snakefile, etc.). In the figure below, we represent the interactions between users and files. 
After a user modifies `config.yaml` and `.env`, the `Snakefile` file will decide which Snakemake rules have to be executed to produce the required output files (behavioral features) and which scripts are in charge of producing such files. In addition, users can add or modify files in the `data` folder (for example to configure the [participants files](https://www.rapids.science/setup/configuration/#participant-files) or the [day segment files](https://www.rapids.science/setup/configuration/#day-segments)). + +
+ +
Interaction diagram between the user and important files in RAPIDS
+
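As a rough illustration of the layout described above, the following sketch checks that the `data/` and `src/` subfolders named on this page exist under a given root. It is not part of RAPIDS: the helper `missing_dirs` and the list `EXPECTED_DIRS` are hypothetical names introduced here, and the folder list is copied only from the text above.

```python
from pathlib import Path

# Folders named in the File Structure page, relative to RAPIDS' root folder.
# This list is an assumption taken from the prose above, not from RAPIDS' code.
EXPECTED_DIRS = [
    "data/external",
    "data/raw",
    "data/interim",
    "data/processed",
    "src/data",
    "src/features",
    "src/models",
    "src/visualization",
]


def missing_dirs(root):
    """Return the expected folders that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # "." assumes the script is run from RAPIDS' root folder.
    for d in missing_dirs("."):
        print(f"missing: {d}")
```

Running it from the root folder prints one line per folder that is absent. Some of these folders may only appear after a first pipeline run, so a missing entry is not necessarily an error.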
+ diff --git a/docs/img/files.png b/docs/img/files.png new file mode 100644 index 00000000..58ee5dec Binary files /dev/null and b/docs/img/files.png differ diff --git a/docs/index.md b/docs/index.md index 5c200d6a..c95b1133 100644 --- a/docs/index.md +++ b/docs/index.md @@ -10,6 +10,8 @@ RAPIDS is open source, documented, modular, tested, and reproducible. At the mom :fontawesome-solid-tasks: Join our discussions on our algorithms and assumptions for feature [processing](https://github.com/carissalow/rapids/issues?q=is%3Aissue+is%3Aopen+label%3Adiscussion). +:fontawesome-solid-play: Ready to start? Go to [Installation](https://www.rapids.science/setup/installation/) and then to [Initial Configuration](https://www.rapids.science/setup/configuration/) + ## How does it work? RAPIDS is formed by R and Python scripts orchestrated by [Snakemake](https://snakemake.readthedocs.io/en/stable/). We suggest you read Snakemake's docs but in short: every link in the analysis chain is atomic and has files as input and output. Behavioral features are processed per sensor and per participant. @@ -26,15 +28,6 @@ RAPIDS is formed by R and Python scripts orchestrated by [Snakemake](https://sna 8. **Reproducible code**. You can be sure your code will run in other computers as intended thanks to R and Python virtual environments. You can share your analysis code along your publications without any overhead. 9. **Private**. All your data is processed locally. - - - ## How is it organized? -The `config.yaml` file is the only file that you will have to modify. It includes parameters to manage participants, data sources, sensor data, visualizations and more. - -All data is saved in `data/`. 
The `data/external/` folder stores any data imported by the user, `data/raw/` stores sensor data as imported from your database, `data/interim/` has intermediate files necessary to compute behavioral features from raw data, and `data/processed/` has all the final files with the behavioral features per sensor and participant. - -All the source code is saved in `src/`. The `src/data/` folder stores scripts to download, clean and pre-process sensor data, `src/features` has scripts to extract behavioral features organized in their respective subfolders , `src/models/` can host any script to create models or statistical analyses with the behavioral features you extract, and `src/visualization/` has scripts to create plots of the raw and processed data. - -There are other important files and folders but only relevant if you are interested in extending RAPIDS (e.g. virtual env files, docs, tests, Dockerfile, the Snakefile, etc.). \ No newline at end of file +In broad terms the `config.yaml`, `.env` files, participant files, and day segment files are the only ones that you will have to modify. All data is stored in `data/` and all scripts are stored in `src/`. For more information see RAPIDS' [File Structure](file-structure.md). \ No newline at end of file diff --git a/docs/workflow-examples/minimal.md b/docs/workflow-examples/minimal.md index 12159365..924e21eb 100644 --- a/docs/workflow-examples/minimal.md +++ b/docs/workflow-examples/minimal.md @@ -9,11 +9,11 @@ This is a quick guide for creating and running a simple pipeline to extract miss !!! info "Things to change on each configuration step" 1\. Setup your database connection credentials in `.env`. We assume your credentials group is called `MY_GROUP`. - 2\. Set `America/New_York` as the timezone + 2\. `America/New_York` should be the default timezone 3\. Create a participant file `p01.yaml` based on one of your participants and add `p01` to `[PIDS]` in `config.yaml` - 4\. 
Set `[DAY_SEGMENTS][TYPE]` to `PERIODIC` and `FILE` to a file containing the following lines: + 4\. `[DAY_SEGMENTS][TYPE]` should be the default `PERIODIC`. Change `[DAY_SEGMENTS][FILE]` with the path of a file containing the following lines: ```csv label,start_time,length,repeats_on,repeats_value daily,00:00:00,23H 59M 59S,every_day,0 diff --git a/mkdocs.yml b/mkdocs.yml index 7fe55080..25170edd 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -46,6 +46,7 @@ theme: pages: - Home: 'index.md' + - File Structure: file-structure.md - Setup: - Installation: 'setup/installation.md' - Initial Configuration: setup/configuration.md