Features Detective App
Date of creation: 2024-10-30
Project description
The aim of the project was to create a universal application that detects the most important features in a given data set. In short: the user uploads their own data or loads a ready-made data set in a supported format, then either picks the target column to analyze or lets the application detect it automatically. The application then generates a feature importance chart showing which features have the greatest impact on the selected column. The user also receives a clear description of the chart along with recommendations on what could be changed to improve the results for that column.
Main functionalities
- The user can load a CSV/JSON file with data or use a ready-made sample dataset
- The user indicates the target column, or lets the application detect it automatically (the suggestion is generated by an LLM)
- The application automatically recognizes whether the loaded data poses a regression or a classification problem and selects the appropriate model training algorithm on that basis
- Based on the trained model, a chart containing the most important features is displayed
- Finally, the user receives a clear description of the chart along with recommendations on what actions could improve the results for the analyzed target column
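
The loading and problem-type steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the application's actual code: the function names `load_records` and `detect_problem_type`, the `max_classes` threshold, and the heuristic itself are all assumptions made for this example.

```python
import csv
import io
import json


def load_records(text, fmt):
    """Parse CSV or JSON text into a list of dict rows (hypothetical helper)."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        data = json.loads(text)
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of objects")
        return data
    raise ValueError(f"unsupported format: {fmt}")


def _as_number(value):
    """Return the value as a float, or None if it is not numeric."""
    if isinstance(value, bool):
        return None  # treat booleans as labels, not numbers
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def detect_problem_type(rows, target, max_classes=10):
    """Guess 'classification' or 'regression' from the target column.

    Assumed heuristic: a non-numeric target, or a numeric target with only
    a few distinct whole-number values, is treated as a set of class labels.
    """
    values = [row[target] for row in rows if row.get(target) not in (None, "")]
    if not values:
        raise ValueError(f"column {target!r} has no values")
    numbers = [_as_number(v) for v in values]
    if any(n is None for n in numbers):
        return "classification"
    distinct = set(numbers)
    if all(n.is_integer() for n in distinct) and len(distinct) <= max_classes:
        return "classification"
    return "regression"


if __name__ == "__main__":
    sample = "age,price,sold\n3,10500.5,yes\n7,6200.0,no\n1,15999.9,yes\n"
    rows = load_records(sample, "csv")
    for column in ("price", "sold"):
        print(column, detect_problem_type(rows, column))
```

The returned label could then decide whether training goes through `pycaret.regression` or `pycaret.classification`. A heuristic like this misjudges some cases, for example a small integer target that is really a count, so the threshold would need tuning for real data.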
ML model training
I trained the models with PyCaret; the implementation is available as a notebook ready for download:
Download Notebook: Model training
Skills
- Python
- Langfuse
- OpenAI
- Streamlit
- PyCaret (Classification & Regression)
- Pandas
- Matplotlib
- Instructor
- Pydantic
- Boto3
Sample photos
Application testing
The application has been deployed on Streamlit Community Cloud and is available for public use.
To use the application you need your OpenAI API Key.