
Features Detective App

Date of creation: 2024-10-30

Project description

The aim of the project was to build a general-purpose application that identifies the most important features in a given dataset. In short: the user uploads their own data or loads a ready-made dataset in a supported format, then either selects the column to analyze manually or lets the application detect it automatically. The application then generates a feature importance chart showing which features have the greatest impact on the selected column. The user also receives a clear description of the chart along with recommendations on what could be changed, for example to improve the values in the analyzed column.

Main functionalities

  • The user can load a CSV/JSON file with data or use a ready-made sample dataset
  • The user selects the target column manually or uses automatic target column detection (performed by an LLM)
  • The application automatically recognizes whether the loaded data represents a regression or a classification problem and selects the appropriate model training algorithm accordingly
  • Based on the trained model, a chart containing the most important features is displayed
  • Finally, the user receives a clear description of the chart along with recommended actions for improving the results in the analyzed target column
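The project page does not say how the regression-versus-classification decision is made, so the following is only a minimal sketch of one common heuristic, not the application's actual logic. The function name `detect_problem_type` and the `max_classes` threshold are assumptions chosen for illustration: a non-numeric target, or an integer-valued target with few distinct values, is treated as classification, and everything else as regression.

```python
from typing import Any, Iterable


def _is_number(value: Any) -> bool:
    """Return True for ints/floats and for strings that parse as a float."""
    if isinstance(value, bool):
        # Booleans are labels, not quantities, so they are not treated as numbers.
        return False
    if isinstance(value, (int, float)):
        return True
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def detect_problem_type(target_values: Iterable[Any], max_classes: int = 20) -> str:
    """Guess the problem type from the target column (hypothetical heuristic).

    `max_classes` is an illustrative threshold, not a value from the project.
    """
    # Ignore missing entries so they do not influence the decision.
    values = [v for v in target_values if v is not None and v != ""]
    if not values:
        raise ValueError("target column has no usable values")

    # Any non-numeric value means the target holds labels.
    if not all(_is_number(v) for v in values):
        return "classification"

    numbers = [float(v) for v in values]
    all_integers = all(n.is_integer() for n in numbers)

    # A small set of whole numbers usually encodes classes (0/1, 1..5, ...).
    if all_integers and len(set(numbers)) <= max_classes:
        return "classification"
    return "regression"
```

For example, `detect_problem_type(["yes", "no", "yes"])` falls into the label branch, while a column of prices such as `[1.5, 2.7, 3.1]` falls through to regression. A threshold-based rule like this can misjudge count-like targets, so an application would probably let the user override the guess.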

ML model training

I trained the models with PyCaret. The implementation is available in a downloadable notebook:

Download Notebook: Model training
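The notebook itself is not reproduced here, so the snippet below is only a rough sketch of how such a training step might look with PyCaret's functional API. It uses `setup`, `compare_models` and `plot_model` from `pycaret.classification` or `pycaret.regression`. The helper names `pycaret_module_name` and `train_and_plot`, the `session_id` value, and the assumption that `data` is a pandas DataFrame are illustrative, not taken from the project.

```python
import importlib
from typing import Any


def pycaret_module_name(problem_type: str) -> str:
    """Map a problem type to the matching PyCaret module name."""
    if problem_type not in ("classification", "regression"):
        raise ValueError(f"unsupported problem type: {problem_type!r}")
    return f"pycaret.{problem_type}"


def train_and_plot(data: Any, target: str, problem_type: str) -> Any:
    """Train candidate models and save a feature importance plot.

    Hypothetical helper: `data` is assumed to be a pandas DataFrame that
    contains the `target` column. PyCaret is imported lazily, so it is
    only required when this function is actually called.
    """
    module = importlib.import_module(pycaret_module_name(problem_type))

    # Prepare the experiment; session_id fixes the random seed (arbitrary value).
    module.setup(data=data, target=target, session_id=42, verbose=False)

    # Cross-validate the available algorithms and keep the best one.
    best_model = module.compare_models()

    # Save the feature importance chart to an image file in the working directory.
    # This plot is only available for models that expose coefficients
    # or feature importances.
    module.plot_model(best_model, plot="feature", save=True)
    return best_model
```

A call such as `train_and_plot(df, "price", "regression")` would then run the whole comparison. The column name `price` is a made-up example.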

Skills

  • Python
  • Langfuse
  • OpenAI
  • Streamlit
  • PyCaret (Classification & Regression)
  • Pandas
  • Matplotlib
  • Instructor
  • Pydantic
  • Boto3

Sample photos

[Four screenshots of the application]

Link to repository