Features Detective App

Date of creation: 2024-10-30

Project description

The aim of the project was to build a general-purpose application that detects the most important features in a given dataset. In short, the user uploads data or loads a ready-made dataset in a supported format, then either lets the application detect the target column automatically or selects it manually. The application then generates a feature importance chart showing which features have the greatest impact on the selected column. The user also receives a clear description of the chart along with recommendations on what could be changed to improve the results for the analyzed column.

Main functionalities

  • The user can load a CSV/JSON file with data or use a ready-made sample dataset
  • The user selects the target column, or lets an LLM detect it automatically
  • The application automatically recognizes whether the loaded data represents a regression or a classification problem and selects the appropriate model training algorithm on that basis
  • Based on the trained model, a chart containing the most important features is displayed
  • Finally, the user receives a clear description of the chart along with recommendations on what actions to take to improve the results for the analyzed target column
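
The first three steps above (loading CSV/JSON data, picking a target column, and deciding between regression and classification) can be illustrated with a minimal standard-library sketch. It is not the application's actual code: the function names `load_rows` and `infer_task`, the `max_classes` threshold, and the heuristic itself are assumptions made for illustration only.

```python
import csv
import io
import json


def load_rows(text, fmt):
    """Parse CSV or JSON text into a list of dicts, one per row."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        rows = json.loads(text)
        if not isinstance(rows, list):
            raise ValueError("expected a JSON array of row objects")
        return rows
    raise ValueError(f"unsupported format: {fmt}")


def _is_number(value):
    """True if the value is numeric or a string that parses as a float."""
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return True
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def infer_task(rows, target, max_classes=10):
    """Guess the problem type from the target column (hypothetical heuristic).

    Non-numeric targets are treated as classification. Numeric targets with
    at most `max_classes` distinct whole-number values are also treated as
    classification; everything else is treated as regression.
    """
    values = [row[target] for row in rows if row.get(target) not in (None, "")]
    if not values:
        raise ValueError(f"target column {target!r} has no values")
    if not all(_is_number(v) for v in values):
        return "classification"
    distinct = {float(v) for v in values}
    if len(distinct) <= max_classes and all(x.is_integer() for x in distinct):
        return "classification"
    return "regression"


sample = "age,price,label\n30,10.5,yes\n41,99.9,no\n25,42.0,yes\n"
rows = load_rows(sample, "csv")
print(infer_task(rows, "label"), infer_task(rows, "price"))
```

A threshold-based heuristic like this can misclassify small datasets (for example, a whole-number regression target with only a few rows), so a real application would likely let the user override the detected problem type.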

ML model training

I trained the models with PyCaret. The implementation is available in a downloadable notebook:

Download Notebook: Model training
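
The notebook is not reproduced here. As a rough sketch of what PyCaret-based training with a feature importance plot might look like, the example below picks the PyCaret module that matches the detected problem type and runs PyCaret's functional API. The helper names `pycaret_module_name` and `train_and_plot_importance`, the `session_id` value, and the choice of `compare_models` are assumptions, not the notebook's confirmed contents.

```python
import importlib


def pycaret_module_name(task):
    """Map a problem type to the matching PyCaret module name."""
    if task not in ("classification", "regression"):
        raise ValueError(f"unknown task: {task}")
    return f"pycaret.{task}"


def train_and_plot_importance(df, target, task):
    """Train candidate models and save a feature importance plot.

    `df` is expected to be a pandas DataFrame. PyCaret is imported lazily,
    so it only needs to be installed when this function is called.
    """
    pc = importlib.import_module(pycaret_module_name(task))
    # Initialise the experiment; session_id fixes the random seed.
    pc.setup(data=df, target=target, session_id=42, verbose=False)
    # Cross-validate the available estimators and keep the best one.
    best = pc.compare_models()
    # The "feature" plot needs a model that exposes coefficients or
    # feature importances; not every estimator does.
    pc.plot_model(best, plot="feature", save=True)
    return best


print(pycaret_module_name("regression"))
```

Calling `train_and_plot_importance(df, "price", "regression")` would then run the whole experiment, assuming PyCaret is installed and `df` contains a `price` column.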

Skills

  • Python
  • Langfuse
  • OpenAI
  • Streamlit
  • PyCaret (Classification & Regression)
  • Pandas
  • Matplotlib
  • Instructor
  • Pydantic
  • Boto3

Application testing

The application is deployed on Streamlit Community Cloud and is available for public use.
To use the application, you need your own OpenAI API key.

Link to repository

Go to application