# Understanding the Mathematics behind Data Science Engineering

Data science engineering brings together science, mathematics, very large data sets, and the programming skills needed to work with them, with the goal of developing a solution that runs efficiently on particular hardware or cloud infrastructure.

Data science problems consume large amounts of computing resources and can take considerable time to produce a result, even on a supercomputer. This also means that a cloud application with even a small machine learning or data analytics plugin can take up all available compute cores and memory, stall the application for a long time, and produce a large bill at the end of the month. Understanding the mathematics under the hood matters because it lets you fine-tune a mathematical formulation to fit the problem at hand and arrive at the most efficient solution possible.

In mathematics there are numerous ways to solve a problem, and it is up to the mathematician to find the solution that works best across problems that share the same pattern.
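
As a small illustration of this point, the sketch below computes the same quantity, the sum of the first n integers, in two ways: a term-by-term loop and the closed-form formula n(n+1)/2. The function names are made up for this example; the only point is that a better formulation gives the same answer with far less work.

```python
def sum_by_loop(n):
    """Add the integers 1..n one at a time: n additions."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_by_formula(n):
    """Use the closed form n(n+1)/2: constant work regardless of n."""
    return n * (n + 1) // 2


if __name__ == "__main__":
    n = 1_000_000
    # Both formulations agree; only the amount of work differs.
    print(sum_by_loop(n) == sum_by_formula(n))
```

The loop does work proportional to n, while the formula does a fixed amount of work, which is the kind of difference that shows up on the compute bill at scale.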

As mentioned earlier, data science engineering combines several technologies and sciences, so efficiency is spread across layers of technology. The most important layer, however, is the design of the solution itself, which comes from mathematical modeling.

The following is a list of mathematical theories, techniques, and methods widely used in data science, with their application scenarios.

| Topic | Commonly Used Theories | Applications |
| --- | --- | --- |
| Functions, Variables, Equations, and Graphs | Linear regression; cost functions | Data plotting; pattern identification; forecasting |
| Statistics | Probability; statistical inference; validation; estimates of error; confidence intervals | Prediction; pattern recognition |
| Linear Algebra | Matrices; vectors; tensors; dimensionality reduction techniques | Hugely useful for compact representation of linear transformations on data; dimensionality reduction techniques |
| Calculus | Differentials; integrals; partial derivatives; partial integrals; multivariate calculus; integral transforms; functions of a single variable, limit, continuity and differentiability; mean value theorems, indeterminate forms and L'Hospital rule; maxima and minima; product and chain rule; Taylor's series, infinite series summation/integration concepts; fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals; Beta and Gamma functions; functions of multiple variables, limit, continuity, partial derivatives; basics of ordinary and partial differential equations (not too advanced) | Anywhere there is a rate of change of data, curves, signals |
| Discrete Math | Sets, subsets, power sets; counting functions, combinatorics, countability; basic proof techniques: induction, proof by contradiction; basics of inductive, deductive, and propositional logic; basic data structures: stacks, queues, graphs, arrays, hash tables, trees; graph properties: connected components, degree, maximum flow/minimum cut concepts, graph coloring; recurrence relations and equations; growth of functions and O(n) notation concept | Classification problems; data predictions; data structure design; cryptography, encryption, decryption |
| Numerical Optimization and Operation Research | Maxima, minima, convex function, global solution; linear programming, simplex algorithm; integer programming; constraint programming, knapsack problem | Logistics; supply chain management; scaling; optimization |
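
To make the linear regression, cost function, and partial derivative entries concrete, here is a minimal hand-coded sketch in plain Python. It assumes a mean squared error cost and plain gradient descent; the function names, the learning rate, the step count, and the synthetic data are all choices made for this illustration, not something prescribed by a particular library.

```python
def cost(w, b, xs, ys):
    """Mean squared error cost, J = (1 / 2n) * sum((w*x + b - y)^2)."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the cost above."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step against the gradient to reduce the cost.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2.0 * x + 1.0 for x in xs]  # synthetic data on the line y = 2x + 1
    w, b = fit_line(xs, ys)
    print(round(w, 3), round(b, 3), cost(w, b, xs, ys))
```

The learning rate here was picked to be small enough for this particular data set; on data with a different scale it would likely need to be adjusted, which is one place where knowing the underlying calculus pays off.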

Fortunately for data science engineering, almost all of these mathematical formulations have already been implemented in libraries such as NumPy, SciPy, and scikit-learn. Knowing the mathematics helps in selecting the right library and making the best use of it.
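
As a rough sketch of what relying on a library looks like, the snippet below fits a straight line with NumPy's least-squares solver, `np.linalg.lstsq`. The data is synthetic and the variable names are chosen for this example only; knowing that the problem is a linear least-squares problem is what points to this routine in the first place.

```python
import numpy as np

# Synthetic, noise-free data on the line y = 2x + 1 (made up for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix with a column of ones so the intercept is fitted too.
A = np.column_stack([x, np.ones_like(x)])

# Least-squares solution of A @ [slope, intercept] ~= y.
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs

print(slope, intercept)
```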

In our articles we will cover the use of various mathematical techniques and concepts in real-world data science scenarios. Although it is easiest to play around with these concepts in software like Octave and MATLAB, we will take a more realistic approach and code them by hand using Python and its libraries.
