Week 4 — It’s all about science

Carlos Marín
15 min read · May 3, 2021

--

The objectives for this week are to learn about the scientific method and software testing. The videos for this week, the last week of the Reset Phase, cover interesting topics from Seth Lloyd, the life of Richard Feynman, the Pretotyping Manifesto, and several topics about testing.

Academy trying to finish the videos on time

I want to start this blog by writing about Why you should have your own black box by Matthew Syed. This talk was very engaging for me because Matthew argues coherently that a growth-mindset culture and psychological change are key pieces for development in every area, activity, or line of work in society. In a complex world, talent is not enough; hard work, practice, and resilience are necessary to make big changes. This is the dynamic mindset that every person should look for in their respective field. He uses the aviation industry as an example of dynamic change: there, every accident is considered an opportunity for learning. The famous black box was created with this purpose; it records everything so that the accident can be prevented from happening again. By contrast, the healthcare field has a fixed mindset because it is very self-justifying and has a high-blame culture. A psychological change alone can bring enormous improvements. Not being afraid of asking or failing, and adopting a growth-mindset culture, can give immeasurable results.

The Pretotyping Manifesto is an interesting and useful talk by Alberto Savoia. During this talk, Alberto explains the ideas that make up the manifesto:

  • Innovators beat ideas.
  • Pretotypes beat prototypes.
  • Doing beats talking.
  • Simple beats complex.
  • Now beats later.
  • Commitment beats committees.
  • Data beats opinions.

Applying the ideas Savoia presents will have a great impact the next time I consider an idea to develop. They save time and investment and improve the chances of success: run multiple experiments with small subsets of the target population, and carry out practical activities to corroborate the idea.

Computing a theory of everything by Stephen Wolfram gives another perspective on the universe and computing. He uses his web app WolframAlpha as an example. He explains the principle of computational equivalence, which tells us that even incredibly simple systems can perform computations as sophisticated as anything else. In the computational universe the rules are simple, but they can produce incredibly rich and complex behavior, so the question arises: does the same happen in our universe? Does the universe run on very simple rules?
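To get a feel for what Wolfram means, here is a small sketch of my own (not Wolfram's code) of the Rule 30 cellular automaton, one of his favorite examples of a trivially simple rule producing complex, seemingly random behavior:

```python
# Rule 30: each cell's next state depends only on itself and its two
# neighbors, yet the resulting pattern is famously complex.

def rule30_step(cells):
    """Apply one Rule 30 step to a row of 0/1 cells (edge cells stay 0)."""
    nxt = [0] * len(cells)
    for i in range(1, len(cells) - 1):
        left, center, right = cells[i - 1], cells[i], cells[i + 1]
        # Rule 30 reduces to: new cell = left XOR (center OR right)
        nxt[i] = left ^ (center | right)
    return nxt

def run_rule30(width=31, steps=15):
    row = [0] * width
    row[width // 2] = 1          # a single "on" cell in the middle
    history = [row]
    for _ in range(steps):
        row = rule30_step(row)
        history.append(row)
    return history

if __name__ == "__main__":
    for row in run_rule30():
        print("".join("#" if c else "." for c in row))
```

Running it prints a triangle of `#` characters whose inner texture never settles into an obvious repeating pattern, even though the update rule fits on one line.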

The series of videos about the life and contributions of Richard Feynman is inspiring. Richard was a pioneer in the field of quantum electrodynamics, and he had a gift for explaining complex topics in simple terms.
He worked on the Manhattan Project at Los Alamos. Richard succeeded in solving and explaining the theory of quantum electrodynamics with simple diagrams, describing a world that no one before him could.

Richard also made contributions to the computational sciences: he proposed pipelining (running operations in parallel) and developed the idea of quantum computers.

In the video Feynman on Scientific Method, Richard describes how the scientific method truly works. Everything starts with a guess, whose consequences are computed and then compared to nature. Richard explains that nothing is absolutely true or false: even if something has not been proved wrong so far, that doesn't mean it can't be proved wrong in the future.

In his two videos, Seth Lloyd explains the relationship between quantum mechanics, programming, and machine learning. Seth tries to explain the complex nature of quantum mechanics and its implications for computer science. He explains superposition and how, in the end, the waves produced by electrons are combined to get the final result. He also briefly explains how a quantum computer is built. In the machine learning (ML) video, Seth talks about the potential of quantum mechanics in ML, such as extracting information from a database in ways that are impossible for conventional computers.

Tools for continuous integration is a talk by John Micco where he describes the practices and tools used at Google. He explains the goals of continuous integration and the features of Google's continuous build system.

Continuous integration has three main goals:

  1. Provide real-time information to build monitors (identify failures fast, identify culprit changes, handle flaky tests).
  2. Provide frequent green builds for cutting releases (identify recent green builds, show the results of all testing together, allow release tooling to choose a green build, handle flaky tests).
  3. Enable developing safely (sync to the last green change list, identify whether a change breaks the build before submit, submit with confidence, handle flaky tests).

Google has its own continuous build system, which triggers on every change and uses fine-grained dependencies to determine, for example, that change 2 broke test 1.
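To see what fine-grained dependencies buy, here is a toy sketch of my own (Google's real system is far more elaborate): given a map from each test to the source files it exercises, a change only triggers the tests it can affect, which is also what lets the system pinpoint a culprit change:

```python
# Hypothetical dependency map for illustration: test -> files it exercises
TEST_DEPS = {
    "test_login":   {"auth.py", "session.py"},
    "test_billing": {"billing.py", "auth.py"},
    "test_search":  {"search.py"},
}

def affected_tests(changed_files, deps=TEST_DEPS):
    """Return only the tests whose dependencies intersect the change."""
    changed = set(changed_files)
    return sorted(t for t, files in deps.items() if files & changed)

# A change to auth.py triggers the two tests that depend on it,
# while test_search is skipped entirely, saving compute:
print(affected_tests(["auth.py"]))    # ['test_billing', 'test_login']
print(affected_tests(["search.py"]))  # ['test_search']
```

If `test_login` starts failing right after a change to `auth.py`, and no other recent change touched `test_login`'s dependencies, that change is the culprit.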

Google's continuous build system provides the following benefits:

  • Identifies failures sooner.
  • Identifies the culprit change precisely.
  • Lower computing costs using fine-grained dependencies.
  • Keeps the build green by reducing time to fix breaks.
  • Accepted enthusiastically by product teams.
  • Enables teams to ship with fast iteration times.

Everything has a cost, and this is no exception. The system requires an enormous investment in compute resources, which grows in proportion to the submission rate, average test time, number of variants, and increasing dependencies. It also requires updating dependencies on each change, which takes time and delays the start of testing.

Presubmit is used to develop safely. It makes testing available before submitting and uses fine-grained dependencies. Presubmit avoids breaking the build by capturing the contents of a change and testing it in isolation.

System Architecture

In this talk, the speaker also describes flaky tests: a test is flaky when the system can't assume it passes or fails reliably for a given version of the code. The sources of flakiness can be the infrastructure (machine failures or setup problems) or the tests themselves (race conditions or external dependencies). Having flaky tests means that you can't find the changes that are breaking the build, presubmits will fail inappropriately, and work and compute resources are wasted. The possible responses are to fix them (which is difficult), hide them, and track them.
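A quick sketch of the flake idea (my own toy example, not Google's tooling): run the same test several times against unchanged code; if the outcomes disagree, the test is flaky rather than genuinely broken.

```python
import random

def detect_flaky(test_fn, runs=10):
    """Run test_fn repeatedly; return 'pass', 'fail', or 'flaky'."""
    outcomes = {test_fn() for _ in range(runs)}
    if outcomes == {True}:
        return "pass"
    if outcomes == {False}:
        return "fail"
    return "flaky"       # mixed results on identical code

# Hypothetical tests for illustration:
def stable_test():
    return 2 + 2 == 4                  # deterministic: always passes

def racy_test():
    return random.random() < 0.5       # stands in for a hidden race condition

print(detect_flaky(stable_test))   # pass
print(detect_flaky(racy_test))     # usually flaky
```

Re-running is exactly why flakes are so expensive: every retry burns compute without telling you anything about the change under test.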

At the end of the talk, John explains the test growth that occurs. Its sources are more developers, more tests, and longer-running tests. There is a need to examine the growth trends to predict compute needs. The future plans are to provide incentives for teams to optimize resources and to create smarter scheduling.

I made the following list of the main things that I learned from the video on test engineering by Ivan Ho and Lindsay Pasricha:

  • Product quality and testability are a collective effort.
  • There should be a balance between quality and feature velocity.
  • The software engineers are focused on testability and development and release efficiency. They develop build, release and test automation frameworks.
  • Test engineers are focused on functional testing and on the product and user. They read and debug product code, and write test code.
  • For automated testing, bots are the biggest source of feedback. They run unit tests, end-to-end tests with KIF, performance tests, and screenshot tests.
  • Manual testing is still very important. It is used for test cases that can't be automated or for new features that don't have end-to-end tests yet.

In the video I don't test often… but when I do, I test in production, Gareth Bowles explains the situation of Netflix and of AWS as its deployment platform. Netflix is very complex, and its main goal is availability. According to Gareth, the expected failures are disk failures, power loss, bugs, and human mistakes. It is important to design to avoid failure, but that won't be enough, and exhaustive testing is impossible. This is how the Simian Army was created and production code coverage was implemented.

The purpose of the Simian Army is to cause failure deliberately. Its monkeys create real-world issues and alert developers when they occur. The Simian Army consists of:

  1. Chaos Monkey. It kills random instances.
  2. Chaos Gorilla. It kills zones of deployment.
  3. Chaos Kong. It kills regions of deployment of service.
  4. Latency Monkey. It degrades the network and injects faults.
  5. Conformity Monkey. It looks for outliers.
  6. Circus Monkey. It kills and launches instances to maintain zone balance.
  7. Janitor Monkey. It cleans up unused resources.
  8. Security Monkey. It finds security issues and expiring certificates.

In Test coverage at Google by Andrei Chirila, he explains that test coverage measures describe how much of the code the tests exercise. There are several types of coverage: function, statement, branch, and others. A healthy project has around 85% coverage, tools available that push the number up, and awareness of language differences (Python averages more than 80%, while C++ averages less than 60%).
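To see what statement coverage actually measures, here is a toy tracer of my own (real projects would use a tool like coverage.py): it records which lines of a function a test executes, so adding a test for a missed branch visibly raises the count.

```python
import sys

def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

def trace_lines(fn, *args):
    """Run fn(*args) and return the set of its line numbers that executed."""
    executed = set()
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is fn.__code__:
            executed.add(frame.f_lineno)
        return tracer
    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)
    return executed

# One test input exercises only one branch of classify;
# adding the other input covers the line the first test missed:
one_branch = trace_lines(classify, 5)
both = one_branch | trace_lines(classify, -5)
print(len(one_branch), len(both))
```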

In the talk The testing user experience, Alex Eagle speaks about the testing experience from the point of view of its users (developers). Testing is a hard problem, and he argues that engineers need incentives to test. According to the talk, Google built static analysis into the toolchain to produce suggested fixes. These are applied automatically across Google's code and then enforced in the compiler. When an assertion fails, it prints a message that makes clear what's going on; this is how you make engineers actually want to use it.

Then Alex introduces the breakage cycle:

  1. Do we need a human to take action?
  2. Find the right assignee and route communication.
  3. Explain the problem and likely causes.
  4. Success metric: how quickly it was resolved.

At the end of the talk, Alex Eagle describes what a test result is: a complete representation of the outcome of a test that lets the engineer find the root cause more quickly, using a screenshot of the browser, network analysis, constraints, and infrastructure-feature reporting. XUnit XML can't represent any of those results, so a new standard is necessary to let testing tools tell us what happened.

The last two materials are about chaos engineering and its importance for companies like Netflix, Google, LinkedIn, Facebook, and Amazon. Chaos engineering is a disciplined approach to identifying failures before they become outages. By proactively testing how a system responds under stress, you can identify and fix failures before they end up in the news.

In 2014, chaos engineer was created as a new role. It gives developers more granular control over the blast radius of the failure injection.

Chaos engineering is about how systems behave in the face of failure. A chaos engineering experiment consists of planning the experiment, containing the blast radius, and then scaling it. By breaking things on purpose, we surface unknown issues that could impact systems and customers.

The benefits of chaos engineering are categorized into three:

  1. Customer: the increased availability and durability of service means no outages disrupt their day-to-day lives.
  2. Business: it prevents extremely large losses in revenue and maintenance costs.
  3. Technical: reduced incidents, a lighter on-call burden, increased understanding of system failure modes, and faster mean time to detection of SEVs.

Chaos engineering experiments explore the following categories, in order:

  1. known knowns: Things we are aware of and understand.
  2. known unknowns: Things we are aware of but don’t fully understand.
  3. unknown knowns: Things we understand but are not aware of.
  4. unknown unknowns: Things we are neither aware of nor understand.

Planning your first chaos experiments consists of the following steps:

  1. Creating a hypothesis.
  2. Measuring the impact.
  3. Have a rollback plan.
  4. Go fix it!
  5. Have fun!

Pretotyping

Introduction

In this report, we present three ideas that were tested using the pretotyping technique. Pretotyping is about trying many ideas, translating them into fast failures, and thereby increasing the probability of finding a successful idea. In this case, we began by discussing different ideas as a group and filtered them through the question: "Would we use it?".

Finally, once we had answered yes for the three ideas, we moved on to the next stage: working on the pretotypes. In the next sections you will see our hypotheses and then whether they were correct or not.

Once the ideas were decided, the next step was to find our pretotyping method. In our case, we decided it was a good idea to create fake ads for the ideas and gather data about how many people would voluntarily click on each ad.

Here are the ads we used:

Courselet

Click here if you would like to track the price and get recommendations of courses on any platform (e. g. Coursera, edx, udemy, udacity…) in a single place without having to visit other platforms separately and be aware of the discounts in your courses.

https://bit.ly/3xEvXmn

Gastracker — Gas price tracker

Are you interested in knowing which fuel station in your city has the best prices so you can save some money?

Click here if you’d like to know the prices of your local fuel stations in real-time.

https://bit.ly/2RibyCU

Knowsher — Knowledge sharer

Would you like to use an app in which you can search for abilities that you want to learn in exchange for teaching something else? You don’t need to pay to learn, just to teach something else in return. In case you want it you can also pay a fee so you get access to it without having to teach something back.

https://bit.ly/3e5jb8Y

IDEA 1: Track and recommend course prices — Courselet

Description
Track prices and receive recommendations for courses on any platform (Coursera, edX, Udacity, Udemy…), all in one place, without needing to visit each platform independently, follow discounts, and more.

Hypothesis
Self-paced online learning and education have become popular in the last couple of years, and with the current problems due to the pandemic, people are interested in finding the best cost-benefit courses of all platforms in one single place. Everything from the comfort of their home.

Experiment description
To find out what people think about this idea, we put fake ads to the test through Bitly URLs with a specific description of the service. This way we had the opportunity to discover whether people would click on the link to find out more about the service, giving us a first approximation of how feasible our idea was.

We shared this link with a specific group of people so we could see how many members of the group clicked on the fake ad.

Validation
We selected a group of people whom we thought would be interested in the ideas and shared the link just as if it were a normal ad. 73% of them clicked on the link we shared.

We can determine that people are interested in finding the best cost-benefit courses of all platforms in one single place. We cannot say that people are going to consume the product yet, but at least we can say they have an interest in it.

Results
From the data, we can determine that people are in fact interested in finding their online courses in just one place and tracking prices to get discounts. Nevertheless, we aren't sure yet that people would consistently use our product. That's why we should proceed with a new approach to gather more data: we could, for example, create a very simple prototype and see if we would use it ourselves. Another option would be to select a new, bigger group of people and let them test a version of the service.
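As an aside, figures like our 73% are easy to summarize in a few lines of code. The counts below are made up for illustration (the report only mentions the final percentage); the Wilson interval hints at how much a small sample can actually tell us:

```python
import math

def click_through_rate(clicks, shown):
    """Fraction of people who clicked the ad."""
    return clicks / shown

def wilson_interval(clicks, shown, z=1.96):
    """95% Wilson score interval for a proportion (small-sample friendly)."""
    p = clicks / shown
    denom = 1 + z**2 / shown
    center = (p + z**2 / (2 * shown)) / denom
    margin = z * math.sqrt(p * (1 - p) / shown + z**2 / (4 * shown**2)) / denom
    return center - margin, center + margin

# Hypothetical sample: 22 clicks out of 30 people shown the ad (~73%)
rate = click_through_rate(22, 30)
low, high = wilson_interval(22, 30)
print(f"CTR: {rate:.0%}, 95% interval: {low:.0%}-{high:.0%}")
```

With a group of 30 the interval is wide, which is exactly why the next pretotype should use a larger sample.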

IDEA 2: Gas price tracker app

Description
This app will track the prices of different gas stations in the city, then give you the best recommendation according to the distance and price. This app will help you save money the next time that you put gas in your car.

Hypothesis
The price of gas differs from one gas station to another, and searching for the best gas price is something people are interested in nowadays.

Experiment description
We shared a Bitly link to find out whether the idea was interesting and useful to people. If a person takes the time to click on the link, we can tell they are interested in acquiring the service. This was our way to test and confirm whether our idea has true potential in the current market.

The link, with a brief description of the service, was shared with a specific group of young people, so we could measure what percentage of them were interested enough to click on it.

Validation
We selected a group of target people who could be interested in an app about gas station prices. 73% of the group showed interest in the application by clicking on the link.

According to this result, we can determine that people are interested in getting lower gas prices. We don't know the true potential of our product yet, but at least the premise is enough to attract genuine interest.

Results
With this data, we know that the idea is strong enough to keep going. The next step is to find the right interface and user experience to keep people's attention. Creating a low-budget prototype and running the same type of experiment will help us find the right approach to the product.

IDEA 3: Knowledge exchange app

Hypothesis
Learning is something that all people constantly seek. To be better people and better professionals, we must be constantly learning. To get a better job we must learn new things.

During the Covid-19 pandemic, many people have been affected. Many have lost their jobs, and for those who did not have it, it has been much more difficult to find one.

The best way to keep a job, get one, or even get a better one is to constantly gain knowledge. Learning is the only thing that can assure us that at some point we can succeed.

The problem is that acquiring knowledge is not usually something accessible to all people, much less during a global crisis such as that caused by Covid-19. That is why many people during this crisis, even when they were willing to dedicate time to learn, could not do so because they did not have the way to access quality education.

But it is also true that all people have something they can teach. All people have knowledge that they can share. That is why an application that allows the exchange of knowledge could be very helpful for those people who want to learn something new and do not have the resources to do so.

The most important resources we have are our time and our knowledge, that is why we can share them with someone else, and in exchange for that other people will be able to share their knowledge and time with us.

Experiment Description
To test whether people were interested in using an application in which they could exchange knowledge, we decided to share a link with the following description in several groups:

“Would you like to be able to use an app in which you can look for skills that you want to learn in exchange for you teaching others? You do not need to pay to learn just to teach something to others in return. If you want, you can pay to have access without needing to teach anything”.

We think that if a person clicked on the link after reading the description, they did so because they were interested in using the application or at least knowing a little more about it.

Validation
As with the rest of our experiments, we sent some groups a link with the premise of giving access to the app, as an example of it. The link we used was a tool that let us verify how many people had clicked on it after some time; in a matter of hours, we were able to gauge the interest people had in our app. While this validation can't tell us how the idea would perform in a bigger population, it gives us fast and reliable feedback on whether people are interested enough in the product to click right away in hopes of finding the mentioned app. This simple and fast process fits the pretotyping mindset: getting fast and concise feedback about your product.

Results
We verified that at least 73% of the people we worked with in this study were interested in the idea of the knowledge exchange app, since that is the percentage that clicked on the link. Of course, we could complement this information or increase the resources used to verify the app in a more public way, though in this case there were barriers such as the pandemic, which unfortunately stopped us from doing an in-person study. Certain tests could still be applied to see how people interact with the app, but that would require an implementation to already exist, and on this occasion we just wanted to see how interesting the idea would be for a certain public, so we could fail or succeed quickly and move on to the next step.

Conclusions
The experiments went well. Nevertheless, it is necessary to keep working on the following pretotypes, and with larger groups of people, to see how the idea evolves as we gather data. Once we settle on the idea we want, it would be good to work on prototypes and then on the right "it". We need to keep working and failing fast.

A good resource:

https://www.pretotyping.org/

In conclusion, this week I learned about the importance of testing and the scientific method. Testing is fundamental for enormous companies like Facebook, Google, and Netflix, and knowing how these companies do it improves my own perspective on testing. I also learned that constant feedback can make the difference. It's not about talent; just a change of mindset can make enormous improvements.
