This Python utility provides implementations of both Linear Regression and Gradient Descent; these algorithms are commonly used in Machine Learning.

The utility analyses a set of data that you supply, known as the training set, which consists of multiple data items or training examples. Each training example must contain one or more input values, and one output value. The utility uses this data to derive an equation (called the hypothesis) which defines the relationship between the input values and the output value. The hypothesis can then be used to predict what the output will be for new inputs that were not part of the original training set.

For example, if you are interested in predicting house prices you might compile a training set using data from past property sales, using the selling price as the output value, and various attributes of the houses such as number of rooms, area, number of floors etc.

To use the utility with a training set, the data must be saved in a correctly formatted text file, with each line in the file containing the data for a single training example. A line must begin with the output value followed by a ':'; the remainder of the line should consist of a comma-separated list of the input values for that training example. The number of input values must be the same for each line in the file - any lines containing more/fewer input values than the first line will be rejected. Lines beginning with a '#' symbol will be treated as comments and ignored.

An extract from the House Prices data file might look like this:

# House Price Data

As well as supplying a training set, you will need to write a few lines of Python code to configure how the utility will run. It is recommended that you use the Helper class to do this, which will simplify the use of the utility by handling the wiring and instantiation of the other classes, and by providing reasonable defaults for many of the required configuration parameters. The Helper class has many configuration options, which are documented below. A simple invocation might look something like this:

Helper('house_price_data.txt') \

The Helper is configured using the following methods:

with_iterations
An integer value, defaulting to 1000. This determines the number of iterations of Gradient Descent that will be performed before the calculated hypothesis is displayed. Higher values will yield more accurate results, but will increase the required running time.

with_alpha
A numeric value, defaulting to 1. This method sets the learning rate parameter used by Gradient Descent when updating the hypothesis after each iteration. Up to a point, higher values will cause the algorithm to converge on the optimal solution more quickly; however, if the value is set too high then it will fail to converge at all, yielding successively larger errors on each iteration. Choosing a learning rate value is largely a matter of experimentation - enabling error checking, as detailed below, can assist with this process.

with_error_checking
A boolean value, defaulting to False. When set to True the utility will check the hypothesis error after each iteration, and abort if the error has increased. Setting this can be useful when attempting to determine a reasonable learning rate value for a new data set; however, once this has been done error checking should be disabled in order to increase processing speed.

with_term
Adds a single term to the hypothesis. This method requires a string value (the name that will be used to refer to the new term) and a function object accepting a single parameter, which will be a list containing all the input values for a single training example. This method should be used to add custom, non-linear terms to the hypothesis:

with_term('w^2', lambda l: l[0] * l[0])            # Square of the first input value
with_term('log(n)', lambda l: math.log(l[3], 10))  # Logarithm (base 10) of the 4th input value
with_term('a*b*c', lambda l: l[0] * l[1] * l[2])   # Product of the first 3 input values

Adds a series of linear terms to the hypothesis, one for each of the input parameters in the training set. The terms will be named automatically, 'x1' for the first input parameter, 'x2' for the second and so on.
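The training-file format described above can be sketched with a short, self-contained parser. The parse_training_set function and the sample data values are hypothetical illustrations, not part of the utility itself:

```python
# Hypothetical sketch of the training-file format described above:
# "output:input1,input2,..." per line, '#' lines ignored, and lines whose
# input count differs from the first line's rejected.

def parse_training_set(lines):
    """Return (outputs, inputs) lists parsed from training-file lines."""
    outputs, inputs = [], []
    expected = None  # input count taken from the first valid line
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # comments and blank lines are ignored
        output, _, rest = line.partition(':')
        values = [float(v) for v in rest.split(',')]
        if expected is None:
            expected = len(values)
        elif len(values) != expected:
            continue  # more/fewer inputs than the first line - rejected
        outputs.append(float(output))
        inputs.append(values)
    return outputs, inputs

sample = [
    '# House Price Data (hypothetical values)',
    '245000:3,1180,1',
    '312000:4,1960,2',
    '199000:2,880,1,99',  # wrong number of inputs - rejected
]
outputs, inputs = parse_training_set(sample)
```

Here each inner list of inputs corresponds to the single list argument that a with_term callback would receive for one training example.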
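The iterations/learning-rate trade-off discussed under with_iterations and with_alpha can be illustrated with a minimal, standalone gradient-descent loop for a single-feature hypothesis y = t0 + t1*x. This is a sketch of the general technique, not the utility's own implementation, and the training data is made up:

```python
# Minimal gradient descent for a one-feature linear hypothesis.
# alpha is the learning rate; iterations bounds the number of update steps.

def gradient_descent(xs, ys, alpha=0.1, iterations=1000):
    t0 = t1 = 0.0
    m = len(xs)
    for _ in range(iterations):
        # error of the current hypothesis on each training example
        errs = [(t0 + t1 * x) - y for x, y in zip(xs, ys)]
        # simultaneous parameter update, scaled by the learning rate
        t0 -= alpha * sum(errs) / m
        t1 -= alpha * sum(e * x for e, x in zip(errs, xs)) / m
    return t0, t1

# Data generated from y = 2x + 1; the fit should converge close to that.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
t0, t1 = gradient_descent(xs, ys)
```

With a small alpha more iterations are needed to reach the same accuracy, which is the trade-off the two configuration methods control.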
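The behaviour behind with_error_checking can be sketched in the same style: track the mean squared error after each iteration and abort as soon as it grows, which signals a learning rate that is too high. Again this is an illustrative standalone sketch under that assumption, not the utility's code:

```python
# Error checking: abort gradient descent when the error increases.

def mse(t0, t1, xs, ys):
    return sum(((t0 + t1 * x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def descend_with_checking(xs, ys, alpha, iterations=100):
    t0 = t1 = 0.0
    m = len(xs)
    last_err = mse(t0, t1, xs, ys)
    for i in range(iterations):
        errs = [(t0 + t1 * x) - y for x, y in zip(xs, ys)]
        t0 -= alpha * sum(errs) / m
        t1 -= alpha * sum(e * x for e, x in zip(errs, xs)) / m
        err = mse(t0, t1, xs, ys)
        if err > last_err:
            return 'diverged', i  # error increased - abort
        last_err = err
    return 'converged', iterations

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
```

Running this with a modest alpha converges, while an overly large alpha is caught on an early iteration instead of looping uselessly - which is why the check is useful while tuning, and worth disabling afterwards for speed.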