List of Python libraries used for data science

Data Processing and Model Deployment


NumPy is a library used for numerical computing in Python. It provides fast array computations and is used for performing mathematical operations on arrays and matrices.
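As a minimal sketch of the array operations described above (the values and variable names are illustrative assumptions, not part of the original list):

```python
import numpy as np

# A 2x3 matrix and a length-3 vector with made-up values
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
v = np.array([10.0, 20.0, 30.0])

scaled = a * 2              # element-wise multiplication, no Python loop
shifted = a + v             # broadcasting: v is added to every row of a
col_means = a.mean(axis=0)  # mean of each column
product = a @ a.T           # matrix product, shape (2, 2)

print(shifted)
print(col_means)
print(product)
```

Operations like these run in compiled code over whole arrays, which is where most of NumPy's speed advantage over plain Python loops comes from.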


Pandas is a library used for data manipulation and analysis. It is used for reading, writing, and analyzing data in a tabular format.
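A short sketch of reading, analyzing, and writing tabular data might look like the following. The CSV content and column names are invented for illustration, and an in-memory buffer stands in for a real file:

```python
import io

import pandas as pd

# Hypothetical CSV data; in practice this would usually be a file path
csv_text = """city,year,sales
Oslo,2022,120
Oslo,2023,150
Lima,2022,80
Lima,2023,95
"""

df = pd.read_csv(io.StringIO(csv_text))          # read tabular data
recent = df[df["year"] == 2023]                  # filter rows by a condition
total_by_city = df.groupby("city")["sales"].sum()  # aggregate per group

out = io.StringIO()
df.to_csv(out, index=False)                      # write the table back out

print(total_by_city)
```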


Scikit-learn is a machine learning library used for building predictive models. It provides a wide range of tools for data preprocessing, feature extraction, and model selection.
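One way preprocessing and model building fit together is sketched below. The choice of the bundled iris dataset, a standard scaler, and logistic regression is an assumption made for the example, not a recommendation from the original text:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small example dataset that ships with scikit-learn
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and the classifier are chained into one estimator
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)  # fraction of correct test predictions
print(f"test accuracy: {accuracy:.2f}")
```

Wrapping the scaler and the classifier in a pipeline keeps the scaling parameters fitted on the training split only, which avoids leaking test data into preprocessing.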


TensorFlow is a popular library for building machine learning models, especially neural networks. It is widely used for deep learning tasks and provides tools for building, training, and evaluating models.


Keras is a high-level neural networks API written in Python. Earlier releases could run on top of TensorFlow, CNTK, or Theano; Keras 3 supports TensorFlow, JAX, and PyTorch as backends. It is used for building and training neural networks.


The Natural Language Toolkit (NLTK) is a library used for natural language processing (NLP) tasks such as tokenization, stemming, and parsing.
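Tokenization and stemming can be sketched as follows. This example assumes a simple regular-expression tokenizer and the Porter stemmer, chosen because neither needs extra NLTK data downloads; the sample sentence is made up:

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# Split on runs of word characters; no corpus download required
tokenizer = RegexpTokenizer(r"\w+")
stemmer = PorterStemmer()

text = "The cats are running faster than the dogs"
tokens = tokenizer.tokenize(text)

# Reduce each token to its stem, e.g. "running" becomes "run"
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```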


SciPy is a library used for scientific computing in Python. It provides modules for optimization, integration, linear algebra, and statistics.
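The four areas mentioned above can each be touched in a few lines. The functions, matrices, and sample values below are illustrative assumptions:

```python
import numpy as np
from scipy import integrate, linalg, optimize, stats

# Optimization: find the minimum of (x - 2)^2, which lies at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Integration: integral of x^2 over [0, 3], which equals 9
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 3.0)

# Linear algebra: solve the system A @ x = b
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Statistics: one-sample t-test against a hypothesized mean of 2.0
sample = np.array([2.1, 2.5, 1.9, 2.3, 2.2])
t_stat, p_value = stats.ttest_1samp(sample, popmean=2.0)

print(result.x, area, x, p_value)
```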


Statsmodels is a library used for statistical modeling and data analysis. It provides tools for regression analysis, time-series analysis, and hypothesis testing.


PyCaret is an open-source, low-code machine learning library for data processing and model deployment. Because it is low-code, it saves time: a few lines are enough to run an end-to-end machine learning experiment, whether you are imputing missing values, encoding categorical data, engineering features, tuning hyperparameters, or building ensemble models.


OpenCV is a free, open-source computer vision and machine learning library. Releases up to 4.4 are licensed under the 3-clause BSD license, and releases from 4.5.0 onward under Apache 2.0. It offers a common infrastructure for computer vision applications, which streamlines the use of computer vision in commercial products.

Data Mining and Data Scraping


Scrapy is an open-source, collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.


BeautifulSoup is a Python library for parsing HTML and XML documents and extracting data from them, which makes it a common building block for web scrapers. It does not fetch pages itself; it is typically paired with an HTTP client or a crawler. Raw HTML is often messy and hard to read directly. The current version, BeautifulSoup 4 (BS4), turns that markup into a navigable parse tree, which makes the data easier to analyze. BeautifulSoup detects encodings automatically and handles HTML documents that contain special characters. The parsed document can then be searched to find the elements you are looking for.
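Searching a parsed document can be sketched as below. The HTML snippet and its example.com URLs are invented for illustration, and the parser is Python's built-in "html.parser" so that no extra parser package is assumed:

```python
from bs4 import BeautifulSoup

# Hypothetical page content; a real scraper would download this first
html = """
<html><head><title>Sample page</title></head>
<body>
  <h1>Libraries</h1>
  <ul>
    <li><a href="https://example.com/numpy">NumPy</a></li>
    <li><a href="https://example.com/pandas">pandas</a></li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

title = soup.title.string                     # text of the <title> tag
links = [(a.get_text(strip=True), a["href"])  # (label, URL) for every link
         for a in soup.find_all("a")]
first_item = soup.find("li").get_text(strip=True)

print(title)
print(links)
```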

Data Visualization


Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for statistical graphics, with more polished default styling than plain Matplotlib.


Matplotlib is a data visualization library that provides a wide range of graphs, charts, and plots.
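A line plot and a bar chart side by side give a feel for the API. This sketch assumes a non-interactive backend and renders to an in-memory PNG rather than a window; the data and labels are made up:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safe for scripts and servers
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# One figure with two axes: a line plot and a bar chart
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, np.sin(x), label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("value")
ax_line.legend()

ax_bar.bar(["a", "b", "c"], [3, 7, 5])
ax_bar.set_title("Example counts")

fig.tight_layout()

# Render to PNG bytes; pass a filename instead to write a file
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```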


ggplot is a Python implementation of the grammar of graphics. It is not intended to be a feature-for-feature port of ggplot2 for R, though there is much greatness in ggplot2 that the Python world could stand to benefit from. So there will be feature overlap, but not necessarily mimicry (after all, R is a little weird).

Plotly is an interactive, open-source, browser-based graphing library for Python. Built on top of plotly.js, it is a high-level, declarative charting library. plotly.js ships with over 30 chart types, including scientific charts, 3D graphs, statistical charts, SVG maps, financial charts, and more. Plotly graphs can be viewed in Jupyter notebooks, saved as standalone HTML files, or integrated into Dash applications.


Altair is a declarative statistical visualization library for Python. Its simple, friendly and consistent API, built on top of the powerful Vega-Lite grammar, empowers you to spend less time writing code and more time exploring your data.


AutoViz automatically visualizes any dataset, of any size, with a single line of code. It can also save the resulting interactive charts as HTML files automatically with the "html" setting.