# Actes princiers -- refactoring datascience

## Project Name

Human-readable name: `Actes Princiers`

The project name `Actes Princiers` has been applied to:

- The project title in `datascience/actes-princiers/README.md`
- The folder created for your project in `datascience/actes-princiers`
- The project's Python package in `datascience/actes-princiers/src/actes_princiers`

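
The three names above follow one convention: the human-readable name is lower-cased, then joined with hyphens for the folder and with underscores for the Python package. A minimal sketch of that mapping, for illustration only; the helper names `folder_name` and `package_name` are hypothetical, and this is not Kedro's own implementation:

```python
import re


def _words(human_name: str) -> list[str]:
    """Split a human-readable name into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", human_name.lower())


def folder_name(human_name: str) -> str:
    """Hyphen-separated name, as used for the project folder."""
    return "-".join(_words(human_name))


def package_name(human_name: str) -> str:
    """Underscore-separated name, usable as a Python package name."""
    return "_".join(_words(human_name))


print(folder_name("Actes Princiers"))   # actes-princiers
print(package_name("Actes Princiers"))  # actes_princiers
```
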
A best-practice setup includes initialising git and creating a virtual environment before running `pip install -r src/requirements.txt`.

## Getting started

- Create the virtual environment: `python3 -m venv .venv`
- Activate the virtual environment: `source .venv/bin/activate`
- Install Kedro: `pip install kedro`
- Install the packages and libraries: `pip install -r src/requirements.txt`

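
Before running the `pip install` steps above, it can help to confirm that the virtual environment is really active, so that packages do not land in the system Python. A small sketch using only the standard library; the function name `in_virtualenv` is an illustrative choice:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style virtual environment.

    Inside an environment created with `python3 -m venv`, sys.prefix points at
    the environment while sys.base_prefix still points at the base installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"virtual environment active: {sys.prefix}")
    else:
        print("no virtual environment active; run `source .venv/bin/activate` first")
```
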
**Go to the `actes-princiers` folder**

Then open a terminal in the `actes-princiers` folder and launch Jupyter: `kedro jupyter notebook`

or start the IPython prompt: `kedro ipython`

## Launching the pipelines

Open a terminal in the `actes-princiers` folder and run the default pipeline:

`kedro run`

To run a specific node of the pipeline, pass its name:

`kedro run --nodes=preprocess_html`

To run only the nodes that carry a given tag:

`kedro run --tags=xsl`

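
The invocations above differ only in their filter options. When the pipelines are launched from a script (a scheduler, for example), the command line can be assembled as sketched here. This is an illustration under assumptions: `kedro_run_command` and `run_pipeline` are hypothetical helpers, they use only the `--nodes` and `--tags` options shown above, they assume several names can be passed as a comma-separated list, and they assume `kedro` is on the `PATH` of the active environment.

```python
import subprocess
from typing import Optional, Sequence


def kedro_run_command(
    nodes: Optional[Sequence[str]] = None,
    tags: Optional[Sequence[str]] = None,
) -> list[str]:
    """Build the argument list for `kedro run` with optional node/tag filters."""
    command = ["kedro", "run"]
    if nodes:
        # Assumption: several node names are joined with commas.
        command.append("--nodes=" + ",".join(nodes))
    if tags:
        command.append("--tags=" + ",".join(tags))
    return command


def run_pipeline(project_dir: str, **filters) -> int:
    """Run Kedro from the project folder and return the process exit code."""
    completed = subprocess.run(kedro_run_command(**filters), cwd=project_dir)
    return completed.returncode


print(kedro_run_command())                           # ['kedro', 'run']
print(kedro_run_command(nodes=["preprocess_html"]))  # ['kedro', 'run', '--nodes=preprocess_html']
print(kedro_run_command(tags=["xsl"]))               # ['kedro', 'run', '--tags=xsl']
```

For example, `run_pipeline("actes-princiers", tags=["xsl"])` would start the same run as `kedro run --tags=xsl` typed in the project folder.
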
## Visualizing the pipelines

`kedro viz`

## Building the docs

`./build-docs.sh docs`

The built HTML documentation is in [docs/build/html/](docs/build/html/).

## Developer's rules and guidelines

Declare any dependencies in `src/requirements.txt` for `pip` installation.

To install them, run: `pip install -r src/requirements.txt`

## Tips

If the project's code or configuration changes while a notebook is open, reload the Kedro variables by calling `%reload_kedro` in your notebook, then re-run the code snippet.

## Compared with Kedro best practices

In `actes-princiers/.gitignore`, the rules that would ignore local configuration and data are commented out, so that:

- the data is committed to the git repository
- the local data catalog is committed to the git repository

The commented-out rules are:

    # ignore all local configuration
    # conf/local/**
    # ignore everything in the following folders
    # data/**

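
Because both rules are commented out, git no longer ignores `conf/local/` or `data/`. A minimal sketch that lists the rules still active in a `.gitignore`, which makes this kind of deviation easy to spot; `active_ignore_rules` is a hypothetical helper, and it deliberately skips the finer points of the gitignore syntax (escaped `#`, trailing-space handling):

```python
def active_ignore_rules(gitignore_text: str) -> list[str]:
    """Return the non-empty, non-comment lines of a .gitignore file."""
    rules = []
    for line in gitignore_text.splitlines():
        stripped = line.strip()
        # Blank lines are separators and lines starting with '#' are comments.
        if stripped and not stripped.startswith("#"):
            rules.append(stripped)
    return rules


# The excerpt from this project, plus one hypothetical active rule for contrast.
sample = """\
# ignore all local configuration
# conf/local/**
# ignore everything in the following folders
# data/**
*.log
"""

print(active_ignore_rules(sample))  # ['*.log']
```
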
## Make a package for deployment

See [package-based deployment](https://docs.kedro.org/en/stable/deployment/single_machine.html#package-based) in the Kedro documentation.

If you prefer not to use containerisation, you can instead package your Kedro project using `kedro package`.

Run the following in your project's root directory:

`kedro package`

Kedro builds the package into the `dist/` folder of your project and creates a `.whl` file, which is a Python packaging format for binary distribution.

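
Once `kedro package` has finished, the wheel can be located programmatically, for instance to hand it to a deployment step. A sketch assuming the `dist/` output folder described above; `find_wheels` is an illustrative name:

```python
from pathlib import Path


def find_wheels(project_root: str = ".") -> list[Path]:
    """Return the .whl files under <project_root>/dist, oldest first."""
    dist_dir = Path(project_root) / "dist"
    if not dist_dir.is_dir():
        return []
    return sorted(dist_dir.glob("*.whl"), key=lambda path: path.stat().st_mtime)


wheels = find_wheels(".")
if wheels:
    # The most recently built wheel is the last one in the list.
    print(f"install with: pip install {wheels[-1]}")
else:
    print("no wheel found; run `kedro package` first")
```
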