The importance of indentifying the assumptions inherent in each problem

_config.yml

My estimable colleague Rafik Ben made a blog post concerning a problem.

So what’s so good about Rafi’s problem?

Well like most things in the world, the problem, is not the actual problem. When it comes to computer science, the actual problem lies in identifying or understanding it. And the assumptions one makes are no less important.

Let’s see this in example:

A man and his son are driving in a car one day, when they get into a fatal accident. The man is killed instantly. The boy is knocked unconscious, but he is still alive. He is rushed to hospital, and will need immediate surgery. The doctor enters the emergency room, looks at the boy, and says…

“I can’t operate on this boy, he is my son.”

How is this possible?

It’s one of those things where you either have the answer instantly or you will never get it. It’s because your underlying assumptions and expectations were hidden and/or incorrect. The thesis of this post is this: identify the assumptions inherent behind every problem. For in doing so, you will be more likely to solve it successfully.

Assumptions area always inherent in a problem

Every problem has its own assumptions. And these assumptions will drive very different solutions. For example, in the post Rafi made:

• What if one assumes that the input into the “Pair with sum” function was a disorered list vs ordered list?

• What if one assumed that the input contained invalid data?

• Each assumption would in turn require it’s own unique solution. And those solutions might be drastically different. It should be remembered that not all solutions were created equal.

Assumptions to consider when formulating algorithms

Some assumptions which I feel are important when considering problems:

What are the domain/range of the inputs and outputs? (e.g. positive or negative, above zero or less than 400)
What are the characteristics of the inputs/outputs? (i.e. divisible by 2, integers only, irrational numbers?).
The data structure(s) of the inputs/outputs?
Can you think of a structure that will make your algorithm more efficient.
Is near enough, good enough? What are the costs of inaccuracy? Can we later fix inaccuracies?
Time limitations?
Computational power limitations?
Human factors:
Remember, algorithms are created for the user. You might have to compromise on efficiency and/or accuracy in order to meet this need. For example: I am positively sure that Google can do a lot better in its search results: but the boffins there are acutely aware that it’s bad business if the user is left waiting for a search result for more than a few seconds. Consequently, there are some compromises on accuracy in favor of speed. These factors are essential when considering a solution to a problem.

Another human related issue: the problem must be understandable to the coder, and easily maintainable:

Can you easily understand the problem by looking at it? Easily understood code makes maintenance a breeeze.
What parts do you think will change?

Summary

Understand the problem and its assumptions, and you’re halfway there.

Here is a gist basically listing the above:

Written on February 14, 2017