  • Everyone who uses Python for scientific computing uses NumPy, a third-party package allowing

  • us to work with multidimensional arrays.

  • They are a powerful way of organizing and processing data.

  • Hence NumPy is a fundamental library for those attempting to manipulate larger chunks of

  • information.

  • When your calculations are ready, it is useful to know how to present the results obtained

  • in a graph.

  • Matplotlib is a two-dimensional plotting library specially designed for visualization

  • of Python and especially NumPy computations.

  • It contains a large set of tools allowing you to customize the appearance of the graphs

  • you are working with.

  • Finally, TensorFlow is the main library we will use for machine learning.

  • Okay.

  • The good thing about Anaconda is that NumPy and Matplotlib were installed automatically

  • with it.

  • That’s a strong plus of Anaconda.

  • You don’t have to install the main packages separately as you might have to do if you

  • were to use some other software for programming in Python.

  • However, TensorFlow is not included in the automatically installed packages.

  • So, we will have to do that on our own.

  • That’s a useful programming skill to have.

  • So, the quickest way to install it is by opening your start menu and searching for the “Anaconda

  • Prompt”.

  • Type “pip install TensorFlow” in the following way.

  • “Pip” means either “pip installs packages” or “pip installs Python”.

  • Strange, but true.

  • Now press “Enter”.

  • When the operation is done, you will be set up for the whole course.

  • This is one of the quickest and easiest ways to install modules and packages on your computer

  • in general.

  • I recommend that you also ‘pip install sklearn’.
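
Once the installs finish, it can help to confirm which packages Python can actually see. The sketch below is only an illustration: the helper name find_missing and the list of modules are assumptions, and the names are import names, which can differ from the names given to pip.

```python
import importlib.util

# Import names of the packages mentioned in the lesson (an assumed list).
REQUIRED_MODULES = ["numpy", "matplotlib", "tensorflow", "sklearn"]


def find_missing(modules):
    """Return the subset of `modules` that cannot be found for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any name printed as missing can then be installed from the Anaconda Prompt with pip.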

  • Great!

  • Now that we’ve covered the preliminary preparations, we can focus on using Python to create our

  • first machine learning algorithm.

  • Ok. This is our actual TensorFlow intro.

  • Once again, TensorFlow is a machine learning library developed by Google.

  • It allows us to construct fairly complicated models with little coding.

  • To give you a perspective, our practical example required 20 lines of code.

  • With TensorFlow, it will still be 20 lines of code.

  • No difference whatsoever.

  • TensorFlow is an amazing framework, and you will see that by the end of the course.

  • However, it has a peculiar underlying logic.

  • Remember when you first studied linear algebra or trigonometry?

  • The logic differed greatly from the mathematics you had seen before, right?

  • Well, it’s the same with TensorFlow.

  • Once you start working with it, it’s super easy, but you must make an extra effort to

  • understand it properly.

  • Let’s do that.

  • The most basic notion we’ll need to define is the computational graph.

  • Quoting TensorFlow’s ‘About’ section at tensorflow.org, ‘nodes in the graph represent

  • mathematical operations, while the edges represent the multidimensional data arrays, or tensors,

  • communicated between them’.

  • We won’t be drawing computational graphs, but that’s how such a graph looks.

  • Simple as that.
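
To make the idea of nodes and edges concrete, here is a toy graph written in plain Python. It is not TensorFlow's implementation: the names Node, placeholder, add, and multiply are hypothetical and exist only for this sketch. It shows that building a graph records structure, while values flow only when the graph is run.

```python
class Node:
    """A toy graph node: an operation plus the nodes that feed it."""

    def __init__(self, op, *parents):
        self.op = op            # function applied when the node is evaluated
        self.parents = parents  # edges: where the input values come from

    def run(self, feed):
        # Evaluate the parents first, then apply this node's operation.
        values = [parent.run(feed) for parent in self.parents]
        return self.op(feed, *values)


def placeholder(name):
    # A node with no parents whose value is looked up at run time.
    return Node(lambda feed: feed[name])


def add(a, b):
    return Node(lambda feed, x, y: x + y, a, b)


def multiply(a, b):
    return Node(lambda feed, x, y: x * y, a, b)


# Building the graph computes nothing; it only records the structure.
x = placeholder("x")
z = placeholder("z")
y = add(multiply(x, x), z)  # describes y = x * x + z

# Values flow through the graph only when we explicitly run it.
result = y.run({"x": 3, "z": 4})
print(result)
```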

  • Okay.

  • Let’s start coding, and we’ll grasp the rest of the intuition on the go!

  • Naturally, we’ll start by importing the TensorFlow library.

  • We will import tensorflow as tf.

  • Then, we will generate fake data, once again.

  • This code is virtually the same as the one we used before.

  • There is a single line of code difference.

  • Let’s look at it.

  • For each project you work on, you’ll have a dataset.

  • Perhaps, you are used to xlsx or csv files; however, TensorFlow doesn’t work well with

  • them.

  • It is tensor based, so it likes tensors.

  • Therefore, we want a format that can store the information in tensors.

  • One solution to this problem is npz files.

  • That’s basically NumPy’s file type.

  • It allows you to save nd arrays or n-dimensional arrays.

  • Thinking like computer scientists, we can say tensors can be represented as multidimensional

  • arrays.

  • When we read an npz file, the data is already organized in the desired way.

  • Often, this is an important part of machine learning preprocessing.

  • You are given data in a specific file format.

  • Then you open it, preprocess it, and finally save it into an npz.

  • Later, you build your algorithm, using the npz, instead of the original file.

  • So.

  • Back to the code.

  • As you can see, we have called the inputs and targets we generated: generated inputs

  • and generated targets.

  • Next, we can simply save them into a tensor friendly file.

  • The proper way to do that is to use the np savez method.

  • It involves several arguments.

  • The first one is the file name.

  • It is written in quotation marks.

  • I’ll call it TF underscore intro.

  • Then we must indicate the objects we want to save into the file.

  • The syntax is as follows.

  • The label we want to assign to the nd array equals the array we want to save under that

  • label.

  • For us, the label is inputs and is equal to the generated inputs array.

  • Similarly, the targets are equal to the generated targets.

  • Note it is not required to call them inputs and targets.

  • If we would like to, we could call them with arbitrary names, such as Rad1 and Rad2.

  • Executing the code would save the TF_intro file in the same directory as the Jupyter

  • notebook we are using.
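
The saving step described above might look like the sketch below. The way the fake data is generated is an assumption (the transcript does not show it), so the sizes, ranges, and target formula are illustrative only. A temporary folder is also an assumption, used so the sketch does not write into your working directory; in the lesson the file name alone is passed.

```python
import os
import tempfile

import numpy as np

# Assumed fake data: illustrative sizes, ranges, and target formula.
observations = 1000
rng = np.random.default_rng(0)
xs = rng.uniform(-10, 10, size=(observations, 1))
zs = rng.uniform(-10, 10, size=(observations, 1))

generated_inputs = np.column_stack((xs, zs))        # shape: (1000, 2)
noise = rng.uniform(-1, 1, size=(observations, 1))
generated_targets = 2 * xs - 3 * zs + 5 + noise     # shape: (1000, 1)

# The lesson saves next to the notebook; a temporary folder is used here
# only to keep the sketch out of the working directory.
file_path = os.path.join(tempfile.mkdtemp(), "TF_intro")

# label=array pairs: each keyword becomes a key inside the .npz file.
np.savez(file_path, inputs=generated_inputs, targets=generated_targets)

# np.savez appends the .npz extension when the name does not have it.
print(os.path.exists(file_path + ".npz"))
```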

  • Okay.

  • Ok.

  • Great!

  • We will create two variables that measure the size of our inputs and outputs.

  • The input size is 2, as there are two input variables, the Xs and the Zs we saw earlier,

  • and the output size is 1, as there is only one output - y.

  • Okay.

  • These two lines of code assigned the values 2 and 1 to the variables input size and output

  • size.

  • Time for the peculiar TensorFlow logic.

  • Each object we will create using the TensorFlow library would do nothing, unless explicitly

  • told to.

  • It would rather describe the logic of the machine learning algorithm, but won’t assign

  • values or execute anything.

  • Remember that, as it is crucial.

  • Time to define our first TensorFlow object, the placeholder.

  • Let’s see the line of code that allows us to do that.

  • Inputs equals: tf dot placeholder of tf dot float 32, comma, square brackets, None, input

  • size.

  • The placeholder is where we feed the data.

  • The data contained in our dataset would go into a placeholder.

  • Naturally, we feed both inputs and targets.

  • Let’s include the code for the targets.

  • The data in the npz file contained exactly the inputs and the targets.

  • We will use the npz to feed data into the model through the placeholders.

  • The float 32 indicates the type of data we want.

  • 32-bit float precision is sufficient for most calculations.

  • Finally, we have the dimensions of the two placeholders.

  • The inputs dimensions are None by input size.

  • The input size is the number of input variables we have.

  • The ‘None’ you see here doesn’t mean the data has no dimension.

  • Instead, it means we need not specify it.

  • That’s useful for us, lazy users.

  • We need not know the number of observations or keep track of it.

  • TensorFlow does that for us.

  • We are only interested in the number of input variables.

  • As you can see, this isn’t much different from our linear model.

  • The dimensions we are working with are n by k, where n is the number of observations,

  • and k is the number of variables.

  • Okay.

  • Similarly, the size of the targets is None by the output size.

  • That’s natural, since outputs and targets have the same shape.

  • Remember!

  • Nothing has happened yet.

  • We have instructed the algorithm how we will feed data, but no data has been fed.
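
Written out, the two placeholders could be declared as in the sketch below. One assumption is made loudly: the lesson uses the TensorFlow 1.x API, so on a TensorFlow 2.x installation the same names are reached through the compat.v1 module with 2.x behavior switched off.

```python
import tensorflow.compat.v1 as tf

# Assumption: TensorFlow 2.x is installed, so the 1.x-style API used in
# this lesson is reached through the compat.v1 module.
tf.disable_v2_behavior()

input_size = 2    # two input variables: x and z
output_size = 1   # one output: y

# None leaves the number of observations open; only the number of
# columns is fixed.
inputs = tf.placeholder(tf.float32, [None, input_size])
targets = tf.placeholder(tf.float32, [None, output_size])

print(inputs.shape.as_list())
print(targets.shape.as_list())
```

The printed shapes keep None in the first position, which is how the open number of observations shows up.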

  • Alright.

  • The next thing we would like to do is define the weights and biases.

  • They are declared using the other basic TensorFlow object: the variable.

  • Variables preserve their value across iterations, while placeholders don’t.

  • Allow me to elaborate, please.

  • Let’s say we have 2 inputs: A and B, and 2 targets: C and D.

  • A goes into the inputs placeholder.

  • Through the weights and biases, we can obtain an output.

  • Then, we compare the output with the target C. Depending on the comparison, we’ll vary

  • the weights and biases and continue with the next iteration.

  • Then B goes into the inputs placeholder.

  • Through the updated weights and biases, we calculate the new output, which will then

  • be compared to the second target - D. Then, we’ll adjust the weights and biases, once

  • again.

  • Ok.

  • A and B came and went away.

  • We fed them to the model, got the best out of them, and we no longer need them.

  • The weights and biases, however, were preserved throughout the iterations we performed.

  • More precisely, we updated them and kept their updated version using the information provided

  • by A and B.

  • This is the same process we carried out before, but we used a different wording to describe

  • it.

  • We feed the data in the placeholders and vary the variables.

  • Simple as that.
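
The A, B, C, D story can be imitated in NumPy. This is only a sketch of the idea, not TensorFlow code: the numbers chosen for A, B, C, and D, the zero starting values, and the learning rate are all assumptions. The weights and biases persist and change across iterations, while each observation is used once and then dropped.

```python
import numpy as np

# Variable-like state: created once, then updated and kept between iterations.
weights = np.zeros((2, 1))
biases = np.zeros(1)
learning_rate = 0.05

# Two hypothetical observations (A and B) with their targets (C and D).
data = [
    (np.array([[1.0, 2.0]]), np.array([[3.0]])),    # A, C
    (np.array([[2.0, -1.0]]), np.array([[4.0]])),   # B, D
]

history = []
for batch_inputs, batch_targets in data:
    # Placeholder-like values: they exist only for this iteration.
    outputs = batch_inputs @ weights + biases
    deltas = outputs - batch_targets

    # Nudge the weights and biases against the error (one gradient descent step).
    weights = weights - learning_rate * (batch_inputs.T @ deltas)
    biases = biases - learning_rate * deltas.sum(axis=0)
    history.append(weights.copy())

print(weights.ravel(), biases)
```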

  • Let’s define the variables.

  • The proper method is TF dot variable.

  • The expression you see within the brackets shows us how the variables will be initialized.

  • We will use the random uniform method to be consistent with the minimal example shown

  • earlier.

  • This is something we’ll discuss in more detail later in the course.

  • Notice we have the same shapes as the ones we had in our framework.

  • We’ve prepared the inputs, weights, biases, and targets.

  • We still need the outputs.

  • The outputs of our model are the matrix product of the inputs and the weights,

  • plus the biases.

  • The proper syntax in TensorFlow is: Outputs equals: tf dot matmul of inputs and

  • weights, plus the biases.

  • Matmul relates to the same concept as the dot product, but generalized for tensors.

  • In the brackets, we plug in what we want to multiply.

  • So, this is our linear model, right?

  • Before we can continue, I would like to ask you a question.

  • What will happen if I run the code?

  • Nothing!

  • Our model will do nothing.

  • It doesn’t initialize the weights or biases, nor does it plug in any values.

  • It certainly doesn’t calculate any outputs.

  • It lays the logic through which this will happen later.

  • That’s what TensorFlow logic is all about.
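
A sketch of the variables and the linear model follows. Two things are assumptions: the initialization range of -0.1 to 0.1 (the transcript only says random uniform is used) and the compat.v1 import, which is how the 1.x-style API is reached on TensorFlow 2.x. Printing the outputs shows a symbolic description of a tensor, not numbers, because nothing has been executed.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumes TensorFlow 2.x; gives 1.x graph behavior

input_size = 2
output_size = 1

inputs = tf.placeholder(tf.float32, [None, input_size])

# Assumed initialization range of [-0.1, 0.1], for illustration only.
weights = tf.Variable(
    tf.random_uniform([input_size, output_size], minval=-0.1, maxval=0.1)
)
biases = tf.Variable(tf.random_uniform([output_size], minval=-0.1, maxval=0.1))

# The linear model: a (None x 2) by (2 x 1) matrix product plus the biases.
outputs = tf.matmul(inputs, weights) + biases

# This prints a description of the tensor, not computed values.
print(outputs)
```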

  • Okay.

  • Let’s see where we are in terms of our machine learning framework.

  • We have the data and the model.

  • What we need is an objective function and an optimization algorithm.

  • The loss function we used in the practical example was the average L2 norm loss divided

  • by 2.

  • Let’s call it mean loss.

  • In TensorFlow, its syntax would be: Mean loss equals:

  • Tf, dot losses, dot mean squared error of: the targets and outputs divided by 2.

  • Let’s consider this in more detail.

  • TF is our library.

  • There is a module called losses.

  • Losses contains most of the common loss functions.

  • Then, from losses, we call the method mean_squared_error, which is the average L2 norm loss.

  • Mean_squared_error has two arguments: labels and predictions.

  • The labels are the targets, as we discussed earlier.

  • In supervised learning, we must assign labels, which are the correct values.

  • Obviously, the labels are the targets.

  • The second argument of the mean_squared_error is the predictions following our model.

  • These are the outputs.

  • Finally, we’ll divide by 2, so the code is the same as in the previous example.

  • Working with TensorFlow, this is an adjustment we no longer need.

  • I will only keep it for consistency.

  • The naught after the number 2 is added there, so we will be certain we’ll obtain a float.

  • This is a good programming habit that will save you lots of trouble.

  • Okay.

  • So, what does this line of code do?

  • It only defines the objective function we will use.

  • That’s it.
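
To see which number this objective function stands for, the same quantity can be computed by hand in NumPy. The function below is a stand-in written for this sketch, not the TensorFlow call, and the example targets and outputs are made up.

```python
import numpy as np


def mean_loss(targets, outputs):
    """Mean squared error divided by 2, as described in the lesson."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return np.mean((targets - outputs) ** 2) / 2.0


targets = [[1.0], [2.0], [3.0]]
outputs = [[1.0], [4.0], [0.0]]

# Squared errors are 0, 4 and 9; their mean is 13/3; halved, it is 13/6.
loss = mean_loss(targets, outputs)
print(loss)
```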

  • It is time to introduce the optimization method.

  • Let’s call it optimize.

  • Optimize equals: Tf dot train, dot gradient descent optimizer,

  • of learning, underscore, rate equals 0.05, dot, minimize of mean_loss.

  • The module train contains the most common optimization algorithms.

  • The gradient descent optimizer is the one we will use.

  • Naturally, there are other optimization algorithms besides this one.

  • We will talk about some of them in the next sections of the course.

  • The learning rate is an argument of the optimizer, and this is where we set it.

  • Finally, we must minimize the loss function that we called: “mean loss”.
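
What minimizing with gradient descent amounts to can be sketched in NumPy for the linear model. This is a hand-written illustration under assumptions, not what TensorFlow runs internally: the dataset, the zero starting values, and the 100 steps are made up, while the learning rate of 0.05 is the one from the lesson.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical small dataset: targets follow 2*x - 3*z + 5 exactly.
inputs = rng.uniform(-1, 1, size=(200, 2))
targets = inputs @ np.array([[2.0], [-3.0]]) + 5.0

weights = np.zeros((2, 1))
biases = np.zeros(1)
learning_rate = 0.05

losses = []
for step in range(100):
    outputs = inputs @ weights + biases
    deltas = outputs - targets
    losses.append(np.mean(deltas ** 2) / 2.0)

    # Gradients of the halved mean squared error for each parameter.
    grad_weights = inputs.T @ deltas / len(inputs)
    grad_biases = deltas.mean(axis=0)

    # The update rule: move against the gradient, scaled by the learning rate.
    weights -= learning_rate * grad_weights
    biases -= learning_rate * grad_biases

print(losses[0], losses[-1])
```

A larger learning rate takes bigger steps and can overshoot; a smaller one needs more steps to get close to the minimum.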

  • Okay.

  • We used 7 lines of code to define all the relationships in our model, the loss function

  • and the optimization algorithm.

  • In the TensorFlow world, not a single operation has been executed just yet.

  • We’ve built all the passages and rollercoasters our data will go through, but we have not

  • opened the gates for any of it yet.

  • There is a special syntax when we want to do that.

  • The proper method is: tf dot interactive session, parentheses.

  • I will declare a variable ‘sess’ equal to this operation.

  • In this way, I will only need to write sess, not the whole interactive session method,

  • when I want to execute something.

  • The name sess is short for session.

  • That’s the conventional way to declare a session.

  • In TensorFlow, the training happens in these so-called sessions.

  • When we use the interactive session method, we actually say: it’s time to execute.

  • Alright.

  • The entire framework is ready.

  • It is time for the algorithm to learn.

  • We said that’s done by varying the weights and the biases.

  • Before this happens, though, we must set the arbitrary values from which weights and biases

  • start.

  • This is the process known as initialization.

  • So, let the initializer be equal to: tf dot global variables initializer, brackets.

  • That’s the method that initializes all tensor objects marked as variables.

  • We have two: weights and biases.

  • They will be the only ones affected.

  • Okay.

  • Did we initialize?

  • No!

  • We haven’t executed yet!

  • So, how do we execute?

  • Execution is done through the method run applied to the session.

  • For each execution, we must type: sess dot run,

  • and in brackets, we’ll indicate what we want to run.

  • Therefore, to initialize the variables, we must write:

  • sess dot run, initializer.

  • Awesome! The variables have been initialized.
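
The session and initialization steps could be sketched as below. The compat.v1 import and the initialization range of -0.1 to 0.1 are assumptions, as is closing the session at the end, which the lesson does not mention. Creating the initializer does nothing by itself; only sess.run gives the variables their starting values.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumes TensorFlow 2.x; gives 1.x graph behavior

# Assumed initialization range of [-0.1, 0.1], for illustration only.
weights = tf.Variable(tf.random_uniform([2, 1], minval=-0.1, maxval=0.1))
biases = tf.Variable(tf.random_uniform([1], minval=-0.1, maxval=0.1))

sess = tf.InteractiveSession()

# Creating the initializer only adds an operation to the graph...
initializer = tf.global_variables_initializer()

# ...running it is what actually assigns the starting values.
sess.run(initializer)

# Running a variable returns its current value as a NumPy array.
initial_weights = sess.run(weights)
print(initial_weights)

sess.close()
```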

  • Around this time, we usually import the data.

  • Our training data comes from the npz we saved a couple of lessons ago.

  • So, training data equals np load, TF intro dot npz.

  • Once again, this method will only load the file if it is in the same directory on your

  • computer.

  • Otherwise, we must specify the entire path.
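
Loading might look like the sketch below. To stay self-contained it first writes a small stand-in file into a temporary folder, which is an assumption made for the sketch; in the lesson the file already exists next to the notebook and only its name is passed to np.load.

```python
import os
import tempfile

import numpy as np

# Stand-in for the file saved earlier: dummy arrays with the shapes used
# in the lesson (n x 2 inputs, n x 1 targets).
file_path = os.path.join(tempfile.mkdtemp(), "TF_intro.npz")
np.savez(file_path, inputs=np.zeros((5, 2)), targets=np.ones((5, 1)))

# A bare file name is resolved against the current working directory;
# a full path, as used here, works from anywhere.
training_data = np.load(file_path)

print(training_data.files)              # the labels chosen when saving
print(training_data["inputs"].shape)
print(training_data["targets"].shape)
```

The loaded object behaves like a dictionary, so each array is read back under the label it was saved with.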

  • What is left is the for loop that would minimize the loss function.

  • So, for e in range 100, colon.

  • E stands for epoch.

  • Each iteration over the full dataset in machine learning is called an epoch.

  • From now on, we’ll use this term to describe iterations and number of iterations.

  • We still have two lines of code we must go through.

  • Hold on tight, please! Let’s call the loss at the current epoch

  • “curr loss”.

  • This is basically the same as what we had earlier.

  • We will calculate the loss function at the end of each iteration or epoch.

  • Let’s look at the syntax in TensorFlow.

  • Curr loss is equal to: sess dot run, of course, since we want to execute something.

  • Then, in parentheses, we have 2 arguments.

  • The first argument is a list in which we state what we want to run.

  • In this case, we would like to run the optimization declared earlier.

  • I will also compute the mean loss, as I want to be able to print it later.

  • The second argument is feed dict.

  • It tells the algorithm how the data is going to be fed.

  • The syntax is: Feed underscore dict equals: braces.

  • Now we must instruct it which placeholder is fed what.

  • First, we put the placeholder, so inputs, colon.

  • Then, we feed it from the training data, the tensor called inputs.

  • That’s the syntax.

  • Recall that training data was where we loaded the npz file.

  • The two labels of the npz file were inputs and targets and are self-explanatory.

  • Similarly, the targets come from our npz and were labeled targets.

  • So, we have: Comma, targets, colon, training data,

  • targets.

  • And finally, we must close all the brackets we used.

  • That’s it.

  • The meaning of this line of code is the following: Run the optimize and mean loss operations

  • by filling the placeholder objects with data specified in the feed dict parameter.

  • Cool.

  • There is a little programming trick some of you may not have seen.

  • Since sess dot run returns one value for each operation in the list, we get something for

  • optimize and something for the mean loss.

  • The mean loss is a value, and we already saw that.

  • However, optimize doesn’t return anything.

  • Its output is always ‘None’.

  • The underscore is the conventional name for a return value we want to disregard.

  • We don’t need the return value of optimize, so we’ll disregard it.

  • Okay.

  • Finally, I’ll print the loss at each epoch.

  • That’s our final machine learning algorithm created with TensorFlow.

  • There are just 10 lines of code specific to the problem.
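
Put together, the steps of this lesson could look like the sketch below. Several things are assumptions and not the course's exact code: the compat.v1 import (for running 1.x-style code on TensorFlow 2.x), the in-memory fake data used in place of loading TF_intro.npz, the target formula, the initialization range, and closing the session at the end.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumes TensorFlow 2.x; gives 1.x graph behavior

# Assumed fake data, kept in memory instead of loading TF_intro.npz so the
# sketch is self-contained. The target formula is illustrative only.
rng = np.random.default_rng(0)
generated_inputs = rng.uniform(-10, 10, size=(1000, 2))
generated_targets = generated_inputs @ np.array([[2.0], [-3.0]]) + 5.0
training_data = {"inputs": generated_inputs, "targets": generated_targets}

input_size = 2
output_size = 1

# Describe the model, the loss, and the optimizer. Nothing runs yet.
inputs = tf.placeholder(tf.float32, [None, input_size])
targets = tf.placeholder(tf.float32, [None, output_size])
weights = tf.Variable(
    tf.random_uniform([input_size, output_size], minval=-0.1, maxval=0.1)
)
biases = tf.Variable(tf.random_uniform([output_size], minval=-0.1, maxval=0.1))
outputs = tf.matmul(inputs, weights) + biases
mean_loss = tf.losses.mean_squared_error(labels=targets, predictions=outputs) / 2.
optimize = tf.train.GradientDescentOptimizer(learning_rate=0.05).minimize(mean_loss)

# Open a session and initialize the variables.
sess = tf.InteractiveSession()
initializer = tf.global_variables_initializer()
sess.run(initializer)

# Train: each sess.run call performs one optimization step on the full data.
losses = []
for e in range(100):
    _, curr_loss = sess.run(
        [optimize, mean_loss],
        feed_dict={
            inputs: training_data["inputs"],
            targets: training_data["targets"],
        },
    )
    losses.append(curr_loss)
    print(curr_loss)

sess.close()
```

The printed loss should shrink from epoch to epoch as the weights and biases approach the values used to generate the targets.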

  • Let’s run it.

  • Naturally, we get the same result.

  • We can further plot the data.

  • I won’t go through the code, but you can see the explanation related to it in the course

  • resources.

  • The point is that the graph is the same as before.

  • Alright.

  • There is no need to comment on the actual result.

  • It is far more interesting to comment on the use of TensorFlow.

  • Notice that the loop we used was so generic we can apply this exact same code for different

  • machine learning algorithms.

  • It calculates the current loss, given an optimizer, a loss function, inputs, and targets.

  • All these arguments were defined before the loop using the TensorFlow structure.

  • Whenever we need to build a new model, we can change the appropriate objects, rather

  • than modify the loop or structure as a whole.

  • TensorFlow code is extremely reusable.

  • Further on, we will see that iterating through the data is not always so simple if we want

  • amazing results.

  • However, the logic is the same.

  • This proves useful in real-life, as the code you’ll need doesn’t need to be adjusted

  • too much, and it is easy to share your code with others.

  • Okay.

  • This was our introduction to TensorFlow.

  • Thanks for watching!
