… and why they should be
If you are reading this, you most probably already know quite well what functions are in programming. A function is a common and widespread construct that is present in almost all programming languages.
Generally, a function is a block of code that takes some parameters from outside, executes some operations in which these parameters may be used, and then returns an output value. In many programming languages, functions are also allowed to return nothing, or to return multiple values instead of only one. But for the sake of generality, these cases can also be represented as a single value. For the “no return” case we can use a special value (in Python that special value is None; whenever a function doesn’t explicitly return something, None is returned). And for the case of multiple values, we can bundle them into a vector and treat it as a single object (in Python, a tuple).
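To make the two special cases concrete, here is a minimal sketch. The function names log_message and min_and_max are made up for this illustration:

```python
def log_message(text):
    # No return statement: Python implicitly returns None.
    print(text)


def min_and_max(values):
    # The two "return values" are packed into one tuple object.
    return min(values), max(values)


nothing = log_message("hello")   # nothing is None
result = min_and_max([3, 1, 2])  # result is the single tuple (1, 3)
```

In both cases the caller receives exactly one object, which is what lets us treat every function as producing a single output.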
For example, in Python a function definition looks like this:
    def func_name(param1, param2, ...):
        # do some stuff with input parameters
        return output_value
Now, the question I want to ask: Are these functions from programming true mathematical functions?
Well…, let’s first recall what a mathematical function is.
In mathematics, a function is just a mapping from a set A to a set B in which every element of A has exactly one associated element in B.
The following is an example of a function in math, which has 2 integers as input and outputs one rational number:
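One function of this kind, chosen here purely for illustration (the specific formula is an assumption), takes two integers a and b and returns a rational number:

```latex
f : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Q}, \qquad f(a, b) = \frac{a}{b^{2} + 1}
```

The denominator is always at least 1, so f is defined for every pair of integers. For instance, f(3, 2) = 3/5, no matter how many times we evaluate it.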
One key property of math functions is that each given pair of input values results in only one specific output, or in other words, it’s not possible to evaluate the same function with the same parameters twice and get two distinct outputs.
If we think about this property, it’s not hard to realize that this doesn’t hold for functions in programming.
We can have functions in programming that return different things for distinct calls of the function, even when the input arguments are the same.
For example, consider the following Python code:
    i = 0

    def f(x):
        global i
        i += 1
        return x + i
If we call f(10), we will get 11 the first time. If we call f(10) a second time, we will get 12, then 13, and so on, because the variable i is incremented at each function call.
So, what we have written above is not a function in the mathematical sense (or a pure function, as such functions are often called).
The problem here is that our function f does not depend only on its input values; it also depends on a global variable and changes it. This is why this “function” is not consistent in its output value.
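One possible way to remove this dependence, sketched here under the assumption that the caller is willing to carry the counter around, is to make the counter an explicit parameter and return its updated value. The name f_pure is hypothetical:

```python
def f_pure(x, i):
    # Hypothetical pure counterpart of f: the counter is an explicit input,
    # and the updated counter is returned instead of being stored globally.
    new_i = i + 1
    return x + new_i, new_i


first, counter = f_pure(10, 0)         # first == 11, counter == 1
second, counter = f_pure(10, counter)  # second == 12, counter == 2
again = f_pure(10, 0)                  # same inputs as the first call
```

Calling f_pure(10, 0) always gives (11, 1). The changing state is still there, but it now travels through the inputs and outputs instead of hiding in a global variable.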
So far, we identified a cause of inconsistent behavior of “functions”: depending on the outside world.
Is this the only problem?
Let’s imagine for a moment that we define another Python function, and this time it depends only on its input parameters, with no global variables. Think about this before you continue reading: is it guaranteed to be a mathematical function this time?
The answer is no. So, what else can go wrong?
The other thing that can be a problem is causing side-effects. That is, besides computing the output from the input, a function can still do things that affect other parts of the program, and this can cause functions to have inconsistent outputs.
For example, let’s say we have 2 functions f and g, and both of them take a list of integers as input. The function f adds 1 to the first integer in the list, in place, and returns it; g adds 2 to the first integer, also in place, and returns it. In many programming languages (including Python), a list is passed to a function as a reference to the same object rather than as a copy. This means that if we change the list inside a function, the change is visible everywhere else that list is used. In our example, if we call f and then g on the same list [1, 2, 3], the addition in f is executed before the addition in g, so the calls return 2 and 4 instead of the 2 and 3 we might expect. So what happens in f may change the output of g, and vice versa. Because f and g modify their argument, neither of them is a pure function.
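The scenario with f and g can be sketched as follows. The bodies are one plausible reading of the description, and the names f_no_mutation and g_no_mutation are made up for the side-effect-free variants:

```python
def f(numbers):
    numbers[0] += 1   # side effect: mutates the caller's list
    return numbers[0]


def g(numbers):
    numbers[0] += 2   # side effect: mutates the caller's list
    return numbers[0]


shared = [1, 2, 3]
f_result = f(shared)   # 2
g_result = g(shared)   # 4, because f already changed shared[0] to 2


def f_no_mutation(numbers):
    # Reads the list but never modifies it.
    return numbers[0] + 1


def g_no_mutation(numbers):
    return numbers[0] + 2


data = [1, 2, 3]
pure_results = (f_no_mutation(data), g_no_mutation(data))  # (2, 3)
```

With the mutating versions, the order of the calls matters, and the list ends up as [4, 2, 3]. With the non-mutating versions, each result depends only on the list that was passed in, and data stays untouched.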
All right. So, we saw that functions in programming are not necessarily pure functions, and we also saw what the causes are: dependence on variables other than the input parameters, and side-effects. But what’s the problem with this? Why should we be concerned about these programming constructs not being pure functions?
It turns out that non-pure functions, due to their side-effects, can cause a lot of problems. Side-effects make the flow of a program much harder to predict and can produce unexpected results.
Mathematical functions, unsurprisingly, have been studied extensively in mathematics, and we know much more about them and their properties than about functions with side-effects, whose behavior is harder to model mathematically and to predict.
And those are some of the main reasons why a whole programming paradigm was born: functional programming.
The functional programming paradigm aims to use mostly pure functions and makes them its main characters. By relying on pure functions, programs should have fewer bugs and be easier to debug, test, and prove correct.
There are also so-called pure functional languages, like Haskell, Lean, or PureScript. Unlike impure languages, these force the programmer to write the whole program using only pure functions. Pure functional languages are not very popular, probably because many people don’t like math, and because creating complex programs using only pure functions can be cumbersome in practice, despite the theoretical benefits.
Pure functions can also have some drawbacks in terms of efficiency. Let’s say we have a big list of numbers (e.g. 1 million) that we want to pass as an argument to a pure function that should make just a small change to one specific element and return the resulting list. Such a function is not allowed to make the change directly in the existing list, since that would be a side-effect. So, instead, this function should first copy the original data (those 1 million numbers) into another memory location, make the required change in that copy, then return this new list as its result and leave the original list unchanged. This is quite inefficient compared to a non-pure function.
So, in my opinion, if one wants to make use of the functional programming paradigm and its benefits, the best way would be to use an impure functional language, which allows more freedom in the style a programmer can adopt, and to simply try to use pure functions as often as it seems reasonable to do so. If at some point you think that it is worth sacrificing purity for the sake of efficiency or ease of implementation, I think that is fine.
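The trade-off can be sketched like this. The helper names set_item_pure and set_item_in_place are hypothetical, and a plain Python list is assumed:

```python
def set_item_pure(values, index, new_value):
    # Pure version: copy all elements first (cost grows with the list size),
    # then change the copy and return it.
    updated = list(values)
    updated[index] = new_value
    return updated


def set_item_in_place(values, index, new_value):
    # Impure version: a single assignment, but the caller's list is modified.
    values[index] = new_value
    return values


original = list(range(1_000_000))
changed = set_item_pure(original, 5, -1)
# original[5] is still 5; changed is a separate list with changed[5] == -1
```

The pure version pays for a full copy on every call, while the in-place version does a constant amount of work at the price of a side-effect.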
I hope you found this information useful and thanks for reading!