1. What is the AdaBoost Algorithm?
AdaBoost, also called Adaptive Boosting, is an ensemble method in machine learning. The most common base learner used with AdaBoost is a decision tree with one level, that is, a tree with only a single split; such trees are called decision stumps. The algorithm first builds a model that gives equal weight to all the data points, and then assigns higher weights to the points that were wrongly classified. Points with higher weights are given more importance by the next model, and the algorithm keeps training models until a sufficiently low error is reached.
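Below is a minimal sketch of this idea using scikit-learn's AdaBoostClassifier built from decision stumps; the dataset is synthetic and purely illustrative, and the estimator keyword applies to recent scikit-learn versions (older releases call it base_estimator).

```python
# Hedged sketch: AdaBoost built from decision stumps on a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # each base learner is a one-split stump
    n_estimators=100,                               # stumps trained sequentially on reweighted points
    random_state=42,
)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))
```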
2. What is the Sliding Window method for Time Series Forecasting?
Time series forecasting can be framed as supervised learning: given a sequence of numbers from a time series dataset, we can restructure the data to look like a supervised learning problem.
In the sliding window method, the previous time steps can be used as input variables, and the next time steps can be used as the output variable.
In statistics and time series analysis, this is called a lag or lag method. The number of previous time steps is called the window width or size of the lag. This sliding window is the basis for how we can turn any time series dataset into a supervised learning problem.
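As an illustration, here is a small pandas sketch that restructures a short, made-up series with a window width of 2; the column labels t-2, t-1, and t are just names chosen for this example.

```python
# Sliding-window restructuring of a time series into (inputs, output) rows.
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
supervised = pd.DataFrame({
    "t-2": series.shift(2),  # value two steps back (input)
    "t-1": series.shift(1),  # value one step back (input)
    "t": series,             # current value (the output to predict)
}).dropna()                  # rows without a full window are removed
print(supervised)
```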
3. What do you understand by sub-queries in SQL?
A subquery is a query nested inside another query and is used to retrieve data from the database. The outer query is called the main query, while the inner query is called the subquery. The subquery is typically executed first, and its result is passed on to the main query. Subqueries can be nested inside SELECT, UPDATE, and other statements, and can use comparison operators such as >, < or =.
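For illustration, here is a small sketch using Python's built-in sqlite3 module; the table, columns, and data are invented for the example.

```python
# Subquery example: find employees earning more than the average salary.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 70000), ("Bob", 50000), ("Cara", 90000)],
)

# The inner (sub)query computes the average first; its result feeds the outer comparison.
rows = conn.execute(
    "SELECT name, salary FROM employees "
    "WHERE salary > (SELECT AVG(salary) FROM employees)"
).fetchall()
print(rows)  # [('Cara', 90000.0)]
```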
4. Explain the Difference Between Tableau Worksheet, Dashboard, Story, and Workbook?
Tableau uses a workbook and sheet file structure, much like Microsoft Excel.
A workbook contains sheets, which can be a worksheet, dashboard, or a story.
A worksheet contains a single view along with shelves, legends, and the Data pane.
A dashboard is a collection of views from multiple worksheets.
A story contains a sequence of worksheets or dashboards that work together to convey information.
5. How is a Random Forest related to Decision Trees?
Random forest is an ensemble learning method that works by constructing a multitude of decision trees. A random forest can be constructed for both classification and regression tasks.
A random forest generally outperforms a single decision tree, and it is also far less prone to overfitting the data than a decision tree is.
A single decision tree trained on a specific dataset can grow very deep and overfit. To create a random forest, decision trees are trained on different random subsets of the training dataset, and the predictions of the individual trees are averaged with the goal of decreasing the variance.
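A rough scikit-learn sketch of this comparison on synthetic data; the exact scores will vary and are only meant to show the two APIs side by side.

```python
# Hedged sketch: single decision tree vs. random forest on the same synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(
    n_estimators=200,  # many trees, each grown on a bootstrap sample with random feature subsets
    random_state=0,
).fit(X_train, y_train)

print("single tree  :", tree.score(X_test, y_test))
print("random forest:", forest.score(X_test, y_test))
```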
6. What are some disadvantages of using Naive Bayes Algorithm?
Some disadvantages of the Naive Bayes algorithm are:
It relies on the strong assumption that the input features are independent of each other, which rarely holds in practice.
It is generally not well suited to datasets with large numbers of numerical attributes.
If a category appears in the test data but never in the training data, the model assigns it zero probability (the zero-frequency problem) unless smoothing is applied, as in the sketch after this list.
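Here is a small sketch of that zero-frequency point with a multinomial Naive Bayes on toy word counts; the data and "words" are invented, and alpha is the Laplace smoothing parameter.

```python
# Hedged sketch: Laplace smoothing (alpha) guards against the zero-frequency problem.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Rows are training documents; columns are counts of three hypothetical words.
X_train = np.array([[3, 0, 1],
                    [2, 0, 0],
                    [0, 4, 1],
                    [0, 3, 2]])
y_train = np.array(["spam", "spam", "ham", "ham"])

# The second word never appears in any "spam" document. With alpha close to 0, a test
# document containing that word would get a near-zero spam probability; alpha=1 softens this.
model = MultinomialNB(alpha=1.0).fit(X_train, y_train)
print(model.predict_proba([[2, 1, 0]]))  # probabilities for (ham, spam)
```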
Andrew Ng's new course on ChatGPT Prompt Engineering for Developers, created together with OpenAI, is available now for free!
👇👇
https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/
Some useful Python libraries for data science (a short usage sketch follows the list)
NumPy stands for Numerical Python. The most powerful feature of NumPy is the n-dimensional array. This library also contains basic linear algebra functions, Fourier transforms, advanced random number capabilities, and tools for integration with lower-level languages like Fortran, C, and C++.
SciPy stands for Scientific Python. SciPy is built on NumPy. It is one of the most useful libraries for a variety of high-level science and engineering modules, such as the discrete Fourier transform, linear algebra, optimization, and sparse matrices.
Matplotlib for plotting a vast variety of graphs, from histograms to line plots to heatmaps. You can use the pylab feature in the IPython notebook (ipython notebook --pylab=inline) to use these plotting features inline. If you omit the inline option, pylab turns the IPython environment into one very similar to MATLAB. You can also use LaTeX commands to add math to your plots.
Pandas for structured data operations and manipulations. It is extensively used for data munging and preparation. Pandas was added relatively recently to the Python ecosystem and has been instrumental in boosting Python's usage in the data science community.
Scikit Learn for machine learning. Built on NumPy, SciPy and matplotlib, this library contains a lot of efficient tools for machine learning and statistical modeling including classification, regression, clustering and dimensionality reduction.
Statsmodels for statistical modeling. Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests. An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics are available for different types of data and each estimator.
Seaborn for statistical data visualization. Seaborn is a library for making attractive and informative statistical graphics in Python. It is based on matplotlib. Seaborn aims to make visualization a central part of exploring and understanding data.
Bokeh for creating interactive plots, dashboards and data applications on modern web-browsers. It empowers the user to generate elegant and concise graphics in the style of D3.js. Moreover, it has the capability of high-performance interactivity over very large or streaming datasets.
Blaze for extending the capability of Numpy and Pandas to distributed and streaming datasets. It can be used to access data from a multitude of sources including Bcolz, MongoDB, SQLAlchemy, Apache Spark, PyTables, etc. Together with Bokeh, Blaze can act as a very powerful tool for creating effective visualizations and dashboards on huge chunks of data.
Scrapy for web crawling. It is a very useful framework for extracting specific patterns of data. It can start at a website's home URL and then dig through the web pages within the site to gather information.
SymPy for symbolic computation. It has wide-ranging capabilities from basic symbolic arithmetic to calculus, algebra, discrete mathematics and quantum physics. Another useful feature is the capability of formatting the result of the computations as LaTeX code.
Requests for accessing the web. It works similarly to the standard Python library urllib2 but is much easier to code. You will find subtle differences with urllib2, but for beginners, Requests might be more convenient.
Additional libraries you might need:
os for Operating system and file operations
networkx and igraph for graph based data manipulations
regular expressions for finding patterns in text data
BeautifulSoup for web scraping. It is less capable than Scrapy, as it extracts information from just a single web page in a run.
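To tie a few of these together, here is a tiny end-to-end sketch using NumPy, pandas, and Matplotlib; the column names and values are made up for the example.

```python
# Hedged sketch: generate data with NumPy, summarise it with pandas, plot it with Matplotlib.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=7)
df = pd.DataFrame({"x": np.linspace(0, 10, 100)})
df["y"] = np.sin(df["x"]) + rng.normal(scale=0.3, size=100)  # noisy sine wave

print(df.describe())                    # quick summary statistics from pandas
df.plot(x="x", y="y", kind="scatter")   # Matplotlib via the pandas plotting API
plt.title("Noisy sine wave")
plt.show()
```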
"🔍 Data Integrity Alert: Always double-check your data sources for accuracy and consistency. Inaccurate or inconsistent data can lead to faulty insights. #DataQualityMatters"
"📊 Clear Objectives: Define clear objectives for your analysis. Knowing what you're looking for helps you focus on relevant data and prevents getting lost in the numbers. #AnalyticalClarity"
"📈 Context is Key: Interpret your findings in the context of your industry or domain. A seemingly significant trend might be trivial if it doesn't align with what's happening in your field. #ContextMatters"
Encyclopedia of Data Science & Machine Learning-J. Wang.pdf (261.8 MB)
Encyclopedia of Data Science and Machine Learning, John Wang, 2023
"💡 Start Simple: Don't overcomplicate your analysis. Begin with simple approaches and gradually explore more complex techniques as needed. Simplicity often leads to clarity. #StartSimple"
"🔗 Data Relationships: Understand the relationships between variables. Correlation doesn't always imply causation. Dig deeper to uncover the underlying reasons behind observed patterns. #DataConnections"
"🔍 Missing Data Handling: Handle missing data wisely. Ignoring it or filling it with random values can distort results. Choose appropriate methods like imputation based on context. #MissingData"
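As a small illustration, one simple (context-dependent) option is median imputation with pandas; the data here is made up.

```python
# Hedged sketch: filling missing values with column medians (one of several possible strategies).
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [25, np.nan, 40, 31],
                   "income": [50000, 62000, np.nan, 58000]})
print(df.fillna(df.median(numeric_only=True)))
```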