Welcome to pyfair’s documentation!

Generally

Overview

If you are already familiar with FAIR, please skip to Getting Started With pyfair.

Factor Analysis for Information Risk (FAIR) is a methodology for analyzing cybersecurity risk. In a general sense, it works by breaking risk into its individual components. These components can then be measured or estimated numerically, allowing for a quantitative calculation of risk as a whole.

Note

“Risk” in FAIR is defined as the total dollar amount of expected loss for a given time frame. If you come from a traditional risk management background, you likely know this concept by the more common term “Loss Exposure”. This documentation will use the FAIR nomenclature.

The actual calculation for Risk often takes the form of a Monte Carlo method. This Monte Carlo method supplies random inputs for our model. The model then transforms the inputs in accordance with FAIR calculation rules and provides outputs. These outputs can then be analyzed to determine the potential range of Risk values. pyfair’s purpose is to simplify and automate these Monte Carlo methods.

A Quick Monte Carlo Example

Generally speaking, Monte Carlo experiments are a class of techniques that solve problems using random sampling. Within the context of FAIR they are used to estimate loss by performing calculations on random inputs. This is a brief demonstration of how you can use a Monte Carlo method without knowing anything about FAIR.

Say we know what the height and weight distributions of a certain population look like, but we don’t know what the Body Mass Index (BMI) distribution looks like:

_images/before_monte_carlo_bmi.png

We can use the weight and height distributions from the data we do know to randomly generate 3 height samples and 3 weight samples.

Sample  Weight (kg)  Height (cm)
------  -----------  -----------
1       41           107
2       52           139
3       85           131

For each of these generated samples, we calculate a BMI using the following formula:

\[\text{BMI} = \frac {\text{Weight}_{kg}} {(\text{Height}_{cm} \times .01) ^2}\]

Sample  Weight (kg)  Height (cm)  BMI
------  -----------  -----------  ---
1       41           107          36
2       52           139          27
3       85           131          50

Once we have these BMIs, we can calculate their mean and spread. With only 3 samples this doesn’t give us much information, but if we were to run 10,000 or so samples, we would get a distribution like this:

_images/after_monte_carlo_bmi.png

Most Monte Carlo simulations follow a similar process. We generate random inputs in accordance with a particular distribution, and we then run these inputs through complex or arbitrary formulae we cannot analyze otherwise. The output can then be used to infer what an expected output population looks like.
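This process is easy to sketch in code. Below is a minimal illustration of the BMI example using numpy (not pyfair); the population means and standard deviations are assumed values chosen purely for the demonstration:

```python
import numpy as np

# Assumed population parameters (illustrative only)
rng = np.random.default_rng(42)
weights_kg = rng.normal(60, 15, 10_000)
heights_cm = rng.normal(125, 20, 10_000)

# Transform each random input pair with the BMI formula
bmi = weights_kg / (heights_cm * 0.01) ** 2

# Analyze the outputs to infer the population distribution
print(bmi.mean(), bmi.std())
```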

Nodes

Risk in FAIR (and by extension Risk in our Monte Carlo simulation) is broken down into a series of what pyfair calls “nodes” for calculation. The user supplies two or more of these nodes to generate random data, which in turn will allow us to calculate the mean, max, min, and standard deviation of the Risk and other nodes.

Note

While we refer to the data in these nodes, it is important to remember that each node holds one value per simulation within the Monte Carlo model. For example, if we run 1,000 simulations, each node will hold a vector of 1,000 elements. This will become clearer in the FAIR by Example section.

The nodes are as follows:

_images/calculation_functions.png

One of the benefits of FAIR is the flexibility that comes with being able to pick and choose the data you supply. If you want to supply Loss Event Frequency and Loss Magnitude, you can do that.

_images/lef_and_lm_example.png

If you want to supply Threat Event Frequency, Threat Capability, Control Strength, Primary Loss, and Secondary Loss, you can do that too.

_images/tef_tc_cs_pl_and_sl_example.png

As you can likely see from the above examples, you only need to supply the bare minimum to complete the calculation. The general rule with pyfair is that to properly calculate any node, the node’s child nodes must either be calculable or supplied.

FAIR by Example

This is a quick example of how one might conduct a FAIR calculation by hand. You will likely never need to do this, but it does provide a concrete example of how everything works.

For the purposes of this demonstration, we will keep it simple. We will run a Monte Carlo model composed of three separate simulations and using three inputs. These inputs will be Threat Event Frequency (TEF), Vulnerability (V), and Loss Magnitude (LM). We will use this simulation to estimate the Risk associated with allowing all ports to remain open.

The general approach will be as follows:

_images/fair_by_example_with_numbers.png
  • Step 1: Generate random values to supply TEF, V, and LM

  • Step 2: Multiply your TEF and V values to calculate LEF

  • Step 3: Multiply your LEF and LM to calculate Risk

  • Step 4: Analyze your Risk outputs

Step 1: Generate Our Random Inputs

We start by generating our data. We will generate 3 values for Threat Event Frequency (TEF), 3 values for Vulnerability (V), and 3 values for Loss Magnitude (LM). Most often in FAIR you will see BetaPert distributed random variates. For the sake of simplicity this example will use normally distributed random variates.

First we will estimate TEF. Recall that TEF is the number of threat events that occur, whether or not they result in a loss (represented by a positive number). Here we estimate that if we leave these ports open, we will see around 50,000 attempted intrusions, with a standard deviation of 10,000 events. We then generate three normally distributed random numbers from a curve with a mean of 50,000 and a standard deviation of 10,000.

Mean    Standard Deviation
------  ------------------
50,000  10,000

Generates random TEF values:

Simulation  TEF
----------  ------
1           53,091
2           38,759
3           44,665

Second we will estimate our Vulnerability. Recall that V is the probability that a threat event results in a loss.

Mean  Standard Deviation
----  ------------------
.67   .01

Generates random V values:

Simulation  V
----------  ---
1           .66
2           .67
3           .68

Third we will estimate our Loss Magnitude. Recall that LM is the amount of loss for each Loss Event (represented by a positive number). We estimate that each loss event will result in an average loss of $100, with a standard deviation of $50. We then generate three normally distributed random numbers from a curve with a mean of 100 and a standard deviation of 50.

Mean  Standard Deviation
----  ------------------
100   50

Generates random LM values:

Simulation  LM
----------  ---
1           198
2           150
3           86

Step 2: Calculate LEF Using TEF and V

We can now take our 3 TEF values and our 3 V values and multiply them together element by element. This gives us 3 LEF values.

Simulation  TEF     V    LEF (TEF times V)
----------  ------  ---  -----------------
1           53,091  .66  35,040
2           38,759  .67  25,968
3           44,665  .68  30,372

This matches what we know about Loss Event Frequency: it is the number of threat events that convert into loss events.

Step 3: Calculate Our R Using LEF and LM

Now that we have an LEF and an LM, we can calculate our final Risk, R. R is calculated by taking the number of loss events and multiplying it by the amount lost for each event.

Simulation  LEF     LM   R (LEF times LM)
----------  ------  ---  ----------------
1           35,040  198  6,937,920
2           25,968  150  3,895,200
3           30,372  86   2,611,992

Step 4: Analyze Our Risk Outputs

By using our random inputs and putting them through our Monte Carlo model we were able to calculate Risk for three simulations. The resulting Risk from these simulations is $6,937,920, $3,895,200, and $2,611,992. Now that we have conducted our simulation, we’ve learned that with our estimates we can expect our Risk to have the following attributes:

Risk Mean   Risk Standard Deviation
----------  -----------------------
$4,481,704  $2,221,802
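The four steps above can be reproduced by hand with numpy. This is a sketch of the raw arithmetic rather than pyfair code; the distribution parameters are the ones estimated in this example:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 3  # three simulations, as in the example

# Step 1: generate random inputs for TEF, V, and LM
tef = rng.normal(50_000, 10_000, n)  # Threat Event Frequency
v = rng.normal(0.67, 0.01, n)        # Vulnerability
lm = rng.normal(100, 50, n)          # Loss Magnitude

# Step 2: LEF = TEF x V, element by element
lef = tef * v

# Step 3: R = LEF x LM
risk = lef * lm

# Step 4: analyze the Risk outputs
print(risk.mean(), risk.std(ddof=1))
```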

pyfair, as you will see later on, makes this considerably easier. You should be able to achieve similar results with 5 to 10 lines of code.

from pyfair import FairModel


# Create our model and calculate (don't worry about understanding yet)
model = FairModel(name='Sample')
model.input_data('Threat Event Frequency', mean=50_000, stdev=10_000)
model.input_data('Vulnerability', mean=.67, stdev=.01)
model.input_data('Loss Magnitude', mean=100, stdev=50)
model.calculate_all()
_images/calculation_example.png

Getting Started With pyfair

Usage

This section relates to how to use pyfair.

In general you will supply your inputs, calculate your model, and then do something with the data (e.g. store it, create a report, or feed it into another calculation).

Here is how you can use these pyfair tools to do that.

FairModel

The most basic element of pyfair is the FairModel. The FairModel is used to create basic Monte Carlo simulations as follows:

from pyfair import FairModel


# Create our model
model = FairModel(name='Basic Model', n_simulations=10_000)

# Add normally distributed data
model.input_data('Loss Event Frequency', mean=.3, stdev=.1)

# Add constant data
model.input_data('Loss Magnitude', constant=5_000_000)

# We could hypothetically do BetaPert data
# model.input_data('Loss Magnitude', low=0, mode=10, high=100, gamma=90)

# Run our simulations
model.calculate_all()

# Export results (if desired)
results = model.export_results()

To reiterate what we did: first, we created a model object named “Basic Model” composed of 10,000 simulations. We then supplied the “Loss Event Frequency” node with 10,000 normally distributed random values, and supplied 10,000 constant entries of 5,000,000 to “Loss Magnitude”. We then ran the calculations for the simulation by calling calculate_all(), after which we can export the results or examine the object however we wish.

Note

pyfair uses pandas heavily for data manipulation, and consequently your results will be exported as easy-to-manipulate DataFrames unless otherwise specified.

While there are various ways to create these models (from serialized JSON models, from a database, or by uploading groups of parameters at the same time), the general approach will almost always be the same. You will create the model, you will input your data, and you will calculate your model before using the results.

pyfair will take care of most of the “under the hood” unpleasantness associated with the Monte Carlo generation and FAIR calculation. You simply supply the distribution types and their parameters. These parameters are:

  • BetaPert: low, mode, and high (and optionally gamma)

  • Constant: constant

  • Normal: mean, stdev

Warning

You cannot mix these parameters. If you give a function a “constant” parameter, a “low” parameter, and a “mean” parameter, it will throw an error.

If you don’t supply the right nodes to create a proper calculation, pyfair will tell you what you’re missing. If you don’t supply the right arguments, pyfair will tell you. Et cetera, et cetera, et cetera.

FairMetaModel

At times you will likely want to determine what the total amount of risk is for a number of FairModels. Rolling these model risks up into a single unit is what the FairMetaModel does. These can be created in a number of ways, but most generally you will simply feed a list of FairModels to a FairMetaModel constructor like this:

from pyfair import FairModel, FairMetaModel


# Create a model
model1 = FairModel(name='Risk Type 1', n_simulations=10_000)
model1.input_data('Loss Event Frequency', mean=.3, stdev=.1)
model1.input_data('Loss Magnitude', constant=5_000_000)

# Create another model
model2 = FairModel(name='Risk Type 2', n_simulations=10_000)
model2.input_data('Loss Event Frequency', mean=.3, stdev=.1)
model2.input_data('Loss Magnitude', low=0, mode=10_000_000, high=20_000_000)

# Create our metamodel
metamodel = FairMetaModel(name='Our MetaModel', models=[model1, model2])

# Calculate our MetaModel (and contained Models)
metamodel.calculate_all()

# Export results
metamodel.export_results()

Again, the general workflow is the same. We create our metamodel, we calculate our data, and we export the results.

FairModelFactory

Related to the metamodel is the FairModelFactory object. Often you will want to create a group of models that are identical except for one or two minor differences. For example, if you want to model an entire threat community, you may wish to create a model for “Threat Group 1”, “Threat Group 2”, and “Threat Group 3” before aggregating the risk into a single metamodel. FairModelFactory allows this by taking the parameters that will not change up front, and then accepting the parameters that do change for each model. An example is below:

from pyfair import FairMetaModel, FairModelFactory


# Instantiate factory
factory = FairModelFactory({'Loss Magnitude': {'constant': 5_000_000}})

# Create 3 models with variable arguments
state_actor = factory.generate_from_partial(
    'Nation State',
    {'Threat Event Frequency': {'mean': 50, 'stdev': 5}, 'Vulnerability': {'constant': .95}}
)
hacktivist = factory.generate_from_partial(
    'Hacktivist',
    {'Threat Event Frequency': {'mean': 5_000, 'stdev': 10}, 'Vulnerability': {'constant': .25}}
)
id_thief = factory.generate_from_partial(
    'Identity Thief',
    {'Threat Event Frequency': {'mean': 500, 'stdev': 100}, 'Vulnerability': {'constant': .75}}
)

# Create a metamodel
meta = FairMetaModel('Aggregate', [state_actor, hacktivist, id_thief])
meta.calculate_all()
results = meta.export_results()

FairSimpleReport

The FairSimpleReport is a mechanism to create a basic HTML-based report. It can take Models, MetaModels, or a list of Models and MetaModels like so:

from pyfair import FairModel, FairSimpleReport


# Create a model
model1 = FairModel(name='Risk Type 1', n_simulations=10_000)
model1.input_data('Loss Event Frequency', mean=.3, stdev=.1)
model1.input_data('Loss Magnitude', constant=5_000_000)
model1.calculate_all()

# Create another model
model2 = FairModel(name='Risk Type 2', n_simulations=10_000)
model2.input_data('Loss Event Frequency', mean=.3, stdev=.1)
model2.input_data('Loss Magnitude', low=0, mode=10_000_000, high=20_000_000)
model2.calculate_all()

# Create a report and write it to an output.
fsr = FairSimpleReport([model1, model2])
fsr.to_html('output.html')

As a general rule, if you want to add things together, use a MetaModel and pass it to the report. If you want to compare two things, pass a list of the two things to the report. Simply create the report, and then output the report to an HTML document.

FairDatabase

The FairDatabase object exists to store models so that they can be loaded at a later date. For the sake of space, pyfair does not store all model results. Rather, it stores the parameters for each simulation, which is run anew each time. Because the random seeds for your random number generation stay the same, your results will be reproducible. This works as follows:

from pyfair import FairModel, FairDatabase


# Create a model
model = FairModel('2019 Simulation')
model.bulk_import_data({
    'Loss Event Frequency': {'mean':.3, 'stdev':.1},
    'Loss Magnitude': {'constant': 5_000_000}
})
model.calculate_all()

# Create a database file and store a model
db = FairDatabase('pyfair.sqlite3')
db.store(model)

# Load a model
reconstituted_model = db.load('2019 Simulation')
reconstituted_model.calculate_all()

Frequently Asked Questions (FAQs)

Why do the parameters I use throw errors?

Because of the structure of the FAIR process, it is not possible to use each and every argument type and value. Here are some of the common problems:

Value Range

General rules:

  • No argument can be less than 0

The following nodes must have values from 0 to 1:

  • TC: Threat Capability

  • CS: Control Strength

  • A: Probability of Action

  • V: Vulnerability

Pert distributions:

  • High parameter must be equal to or greater than Mode parameter

  • Mode parameter must be equal to or greater than Low parameter

Parameter Mismatch

Keywords must be used as follows:

  • constant: must be the only parameter used for a given node

  • low, mode, and high: must be used together (gamma is optional)

  • mean, stdev: must be used together

Why are my calculation dependencies unresolved?

pyfair uses the following structure for calculations:

_images/calculation_functions.png

As you can see, this takes the form of a tree composed of nodes. At the bottom there are “leaf” nodes. These nodes can only be supplied with data and cannot be calculated from other values. At the top there is the “root” node representing a dollar value for Risk. It can only be calculated (after all, that is the point of the FAIR exercise). In the middle, we have “branch” nodes. These nodes can either be supplied with values, or calculated if both of the nodes beneath them have been supplied or calculated. By extension, once a node is supplied or calculable, you need not supply any information for the nodes that fall underneath it.

This is clearer when looking at an example. Say you run the following code:

from pyfair import FairModel


# Create an incomplete model
model = FairModel('Tree Test')
model.input_data('Loss Event Frequency', mean=5, stdev=1)
model.calculate_all()

Your code will raise this error:

FairException: Not ready for calculation. See statuses:
Risk                                  Required
Loss Event Frequency                  Supplied
Threat Event Frequency            Not Required
Contact Frequency                 Not Required
Probability of Action             Not Required
Vulnerability                     Not Required
Control Strength                  Not Required
Threat Capability                 Not Required
Loss Magnitude                        Required
Primary Loss                          Required
Secondary Loss                        Required
Secondary Loss Event Frequency        Required
Secondary Loss Event Magnitude        Required

The reason for this is readily apparent when looking at the calculation tree:

_images/incomplete_example.png

As you can see, you supplied “Loss Event Frequency”. That means you do not need to calculate “Loss Event Frequency”, and you also do not have to deal with anything underneath it because it is all superfluous. That said, you cannot calculate Risk because the whole right side of the FAIR calculation has not been supplied.

If you were to create a new model with “Loss Magnitude” and “Loss Event Frequency”, you’d cover both branches of the FAIR model and would receive no error. Notice that you did not have to supply information for everything listed in the error above. pyfair lists them all as required because it has no idea what you’re going to put in next (and so it doesn’t know whether it will be high on the tree or low on the tree).

Why do my simulation results change from run to run?

Monte Carlo simulations are an attempt to harness large numbers of random simulations to model complex outcomes. pyfair seeds its random number generation with a fixed “random seed”. This makes the outcome, while quasi-random and suitable for modeling, deterministic in fact. As a consequence, we can run a pyfair simulation today and the same simulation tomorrow, and they will come out the same if the parameters are the same.

By default, the random seed is 42. If you’re reading this, you’ve probably changed the random seed.
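You can see this determinism with a plain numpy sketch (not pyfair itself, but the same principle applies): two generators given the same seed produce identical streams of “random” numbers.

```python
import numpy as np

# Two generators seeded with the same value...
a = np.random.default_rng(42).normal(0, 1, 5)
b = np.random.default_rng(42).normal(0, 1, 5)

# ...produce exactly the same draws
print((a == b).all())  # True
```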

Calculation Details

The nodes can be described as follows:

Risk (“R”)

Description

A vector of currency values/elements, which represent the ultimate loss for a given time period

Restrictions

All elements must be positive

Derivation

Multiply the Loss Event Frequency vector by the Loss Magnitude vector

\[\begin{split}\begin{bmatrix} \text{R}_{1} \\ \text{R}_{2} \\ \vdots \\ \text{R}_{m} \end{bmatrix} = \begin{bmatrix} \text{LEF}_{1} \\ \text{LEF}_{2} \\ \vdots \\ \text{LEF}_{m} \end{bmatrix} \times \begin{bmatrix} \text{LM}_{1} \\ \text{LM}_{2} \\ \vdots \\ \text{LM}_{m} \end{bmatrix}\end{split}\]

Example

For a given year, if we have the number of times a particular event occurs (Loss Event Frequency/LEF) and the dollar losses associated with each of those events (Loss Magnitude/LM), we can multiply these together to derive the ultimate dollar value amount lost (Risk/R).

Simulation  LEF  LM      R (LEF x LM)
----------  ---  ------  ------------
1           100  $1,000  $100,000
2           100  $2,000  $200,000
3           200  $3,000  $600,000
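The derivation above is a plain elementwise multiplication. A minimal numpy sketch using the table’s values:

```python
import numpy as np

lef = np.array([100, 100, 200])       # Loss Event Frequency
lm = np.array([1_000, 2_000, 3_000])  # Loss Magnitude ($)

# Risk is the elementwise product of the two vectors
r = lef * lm
print(r.tolist())  # [100000, 200000, 600000]
```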

Loss Event Frequency (“LEF”)

Description

A vector of elements which represent the number of times a particular loss occurs during a given time frame (generally one year)

Restrictions

All elements must be positive

Derivation

Supplied directly, or multiply the Threat Event Frequency vector by the Vulnerability vector

\[\begin{split}\begin{bmatrix} \text{LEF}_{1} \\ \text{LEF}_{2} \\ \vdots \\ \text{LEF}_{m} \end{bmatrix} = \begin{bmatrix} \text{TEF}_{1} \\ \text{TEF}_{2} \\ \vdots \\ \text{TEF}_{m} \end{bmatrix} \times \begin{bmatrix} \text{V}_{1} \\ \text{V}_{2} \\ \vdots \\ \text{V}_{m} \end{bmatrix}\end{split}\]

Example

For a given year, if we have the number of times a particular threat occurs (Threat Event Frequency/TEF), and the percentage of times we can expect that threat to turn into a loss (Vulnerability/V), we can multiply these together to derive the number of losses that will occur (Loss Event Frequency/LEF).

Simulation  TEF   V     LEF (TEF x V)
----------  ----  ----  -------------
1           0.50  0.50  0.25
2           200   0.25  50
3           300   1.00  300

Note

Though intended to represent a discrete number of events, TEF and LEF are not rounded to the nearest integer. This allows for the modeling of events that happen infrequently. For instance, if we are running a simulation for a single year, one might model a once-a-century occurrence using a LEF/TEF of 0.01.

Threat Event Frequency (“TEF”)

Description

A vector of elements representing the number of times a particular threat occurs, whether or not it results in a loss

Restrictions

All elements must be positive

Derivation

Supplied directly, or multiply the Contact Frequency vector and the Probability of Action vector

\[\begin{split}\begin{bmatrix} \text{TEF}_{1} \\ \text{TEF}_{2} \\ \vdots \\ \text{TEF}_{m} \end{bmatrix} = \begin{bmatrix} \text{C}_{1} \\ \text{C}_{2} \\ \vdots \\ \text{C}_{m} \end{bmatrix} \times \begin{bmatrix} \text{A}_{1} \\ \text{A}_{2} \\ \vdots \\ \text{A}_{m} \end{bmatrix}\end{split}\]

Example

For a given year, if we have the number of times an actor comes in contact with an asset (Contact Frequency/C), and the probability that the actor will attempt to act on that contact (Probability of Action/A), we can multiply these together to derive the number of times that a particular threat will occur (Threat Event Frequency/TEF).

Simulation  C      A     TEF (C x A)
----------  -----  ----  -----------
1           1,000  0.50  500
2           2,000  0.25  500
3           3,000  1.00  3,000

Vulnerability (“V”)

Description

A vector of elements with each value representing the probability that a potential threat actually results in a loss

Restrictions

All elements must be from 0.0 to 1.0

Derivation

Supplied directly, or via the following operation:

\[\begin{split}\bar{V} \; \text{Where} \; V_{i} = \begin{cases} 1, & \text{if} \; \text{TC}_{i} \; \geq \text{CS}_{i}\\ 0, & \text{if} \; \text{TC}_{i} \; \lt \text{CS}_{i}\\ \end{cases}\end{split}\]

Or in more concrete terms: we have a vector of Threat Capabilities and a vector of Control Strengths. For each element, we determine whether Threat Capability is greater than or equal to Control Strength. A 1 indicates that the threat overwhelms the control, and a 0 indicates that the control withstands the threat.

\[\begin{split}\text{TC} = \begin{bmatrix} 0.60 \\ 0.70 \\ 0.10 \\ \end{bmatrix} \quad \text{CS} = \begin{bmatrix} 0.55 \\ 0.65 \\ 0.75 \\ \end{bmatrix} \quad \overrightarrow{Indicator Function} \quad \text{Intermediate} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ \end{bmatrix}\end{split}\]

We then take this intermediate array of ones and zeros and obtain its average. This represents the percentage of simulations in which the threat overcame the control.

\[\begin{split}\text{Intermediate} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ \end{bmatrix} \quad \overrightarrow{Average} \quad \frac {(1 + 1 + 0)} {3} = 0.66\end{split}\]

This scalar is then assigned to a vector for the sake of computational consistency.

\[\begin{split}\text{V} = \begin{bmatrix} 0.66 \\ 0.66 \\ 0.66 \\ \end{bmatrix}\end{split}\]
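The indicator-then-average derivation above can be sketched with numpy. This illustrates the logic, not pyfair’s internal implementation:

```python
import numpy as np

tc = np.array([0.60, 0.70, 0.10])  # Threat Capability
cs = np.array([0.55, 0.65, 0.75])  # Control Strength

# Step function: 1 where the threat overcomes the control
intermediate = (tc >= cs).astype(float)

# Average the indicators and broadcast back to a vector
v = np.full(len(tc), intermediate.mean())
print(v.round(2).tolist())  # [0.67, 0.67, 0.67] (the text truncates 2/3 to 0.66)
```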

Example

For a given year, if we have the relative strengths of attackers (Threat Capability/TC) and the relative strengths of our controls (Control Strength/CS), we can run a step function and then average the result to obtain a percentage of times we expect a threat to overcome a control (Vulnerability/V).

Simulation  TC    CS    V
----------  ----  ----  ----
1           0.60  0.50  0.33
2           0.10  0.50  0.33
3           0.30  0.40  0.33

Note

For the purposes of this calculation, TC must be estimated relative to CS, and CS must be estimated relative to TC. They are essentially just rough guesses to determine the percentage of threats that will fail or succeed (and consequently have no independent meaning apart from each other).

Contact Frequency (“C”)

Description

A vector with elements representing the number of threat actor contacts that could potentially yield a threat within a given timeframe

Restrictions

All elements must be a positive number

Derivation

None (this must be supplied, not calculated)

Example

For a given year, the number of contacts that can potentially yield an attack, and in turn can potentially yield a loss (Contact Frequency/C).

Simulation  C
----------  ---------
1           5,000,000
2           3,000,000
3           2,500,000

Probability of Action (“A”)

Description

A vector with elements representing the probability that a threat actor will proceed after coming into contact with an organization

Restrictions

All elements must be a number from 0.0 to 1.0

Derivation

None (this must be supplied, not calculated)

Example

The probability that a contact results in action being taken against a resource (Probability of Action/A)

Simulation  A
----------  ----
1           0.95
2           0.90
3           0.80

Threat Capability (“TC”)

Description

A vector of unitless elements that describe the relative level of expertise and resources of a threat actor (relative to a Control Strength)

Restrictions

All elements must be a number from 0.0 to 1.0

Derivation

None (this must be supplied, not calculated)

Example

The relative strength of a threat actor (Threat Capability/TC) as it relates to the relative strength of the controls (Control Strength/CS)

Simulation  TC
----------  ----
1           0.75
2           0.60
3           0.70

Control Strength (“CS”)

Description

A vector of unitless elements that describe the relative strength of a given control (relative to the Threat Capability of a given actor)

Restrictions

All elements must be a number from 0.0 to 1.0

Derivation

None (this must be supplied, not calculated)

Example

The relative strength of a set of controls (Control Strength/CS) as it relates to the relative strength of a threat actor (Threat Capability/TC)

Simulation  CS
----------  ----
1           0.15
2           0.10
3           0.05

Loss Magnitude (“LM”)

Description

A vector of currency values describing the total loss for a single Loss Event

Restrictions

All elements must be positive

Derivation

Supplied directly, or the sum of the Primary Loss vector and Secondary Loss vector

\[\begin{split}\begin{bmatrix} \text{LM}_{1} \\ \text{LM}_{2} \\ \vdots \\ \text{LM}_{m} \end{bmatrix} = \begin{bmatrix} \text{PL}_{1} \\ \text{PL}_{2} \\ \vdots \\ \text{PL}_{m} \end{bmatrix} + \begin{bmatrix} \text{SL}_{1} \\ \text{SL}_{2} \\ \vdots \\ \text{SL}_{m} \end{bmatrix}\end{split}\]

Example

For a given loss, if we have the total dollar amount of a primary loss (Primary Loss/PL), and the total dollar amount of a secondary loss (Secondary Loss/SL), we can obtain the total amount (Loss Magnitude/LM) by adding PL and SL.

Simulation  PL    SL   LM (PL + SL)
----------  ----  ---  ------------
1           $120  $80  $200
2           $210  $5   $215
3           $200  $60  $260

Primary Loss (“PL”)

Description

A vector of currency losses directly attributable to the threat

Restrictions

All elements must be positive

Derivation

None (this must be supplied, not calculated)

Example

The amount of the loss directly attributable to the threat (Primary Loss/PL)

Simulation  PL
----------  ----------
1           $5,000,000
2           $3,500,000
3           $2,500,000

Secondary Loss (“SL”)

Description

A vector of currency losses attributable to secondary factors

Restrictions

All elements must be positive

Derivation

Supplied directly, or the rowwise sum of the elementwise product of 1) the Secondary Loss Event Frequency matrix and 2) the Secondary Loss Event Magnitude matrix.

\[\begin{split}\begin{bmatrix} \text{SL}_{1} \\ \text{SL}_{2} \\ \vdots \\ \text{SL}_{m} \\ \end{bmatrix} \quad = \quad \sum\limits^n_{j=1} \quad \left( \quad \begin{bmatrix} \text{SLEF}_{1,1} & \text{SLEF}_{1,2} & \dots & \text{SLEF}_{1,n} \\ \text{SLEF}_{2,1} & \text{SLEF}_{2,2} & \dots & \text{SLEF}_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \text{SLEF}_{m,1} & \text{SLEF}_{m,2} & \dots & \text{SLEF}_{m,n} \\ \end{bmatrix} \quad \circ \quad \begin{bmatrix} \text{SLEM}_{1,1} & \text{SLEM}_{1,2} & \dots & \text{SLEM}_{1,n} \\ \text{SLEM}_{2,1} & \text{SLEM}_{2,2} & \dots & \text{SLEM}_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \text{SLEM}_{m,1} & \text{SLEM}_{m,2} & \dots & \text{SLEM}_{m,n} \\ \end{bmatrix} \quad \right)\end{split}\]

Example

For a given model, we can have a matrix of secondary loss probabilities. Each row represents a simulation and each column represents a loss type. In the example below we have three probability columns for three different loss types. E.g. the probabilities of loss for simulation 1 are 0.95, 0.05, and 1.00.

Simulation  Prob Loss A  Prob Loss B  Prob Loss C
----------  -----------  -----------  -----------
1           0.95         0.05         1.00
2           0.90         0.10         1.00
3           0.50         0.10         0.80

For a given model, we can also have the dollar amounts associated with these individual loss types.

Simulation  $ Loss A  $ Loss B  $ Loss C
----------  --------  --------  --------
1           $1,000    $100      $50
2           $2,000    $50       $90
3           $1,500    $30       $25

This allows us to match up these matrices on an element-by-element basis and say something like:

Cell 1A from table 1 is 0.95 and cell 1A from table 2 is $1,000. Multiplying (Sim 1, Prob Loss A) by (Sim 1, $ Loss A) yields $950. We can put this result in table 3.

Simulation  Secondary Loss A
----------  ----------------
1           $950

If we do this for every cell in tables 1 and 2, we get a new table that has the secondary losses for each loss type and each simulation.

Simulation  SL (A)  SL (B)  SL (C)
----------  ------  ------  ------
1           $950    $5      $50
2           $1,800  $5      $90
3           $750    $3      $20

Finally, it is possible to add up each row to get the total amount of Secondary Loss for a given simulation. This Secondary Loss vector can then be added to the Primary Loss vector to do further calculations.

Simulation  Total Secondary Loss
----------  --------------------
1           $1,005
2           $1,895
3           $773
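Since pyfair leans on pandas, the worked example above can be sketched with two DataFrames. This illustrates the elementwise-multiply-then-sum arithmetic, not pyfair’s internal code; the column names are illustrative and the values come from the tables:

```python
import pandas as pd

# Secondary Loss Event Frequency (probability of each loss type)
slef = pd.DataFrame({
    'A': [0.95, 0.90, 0.50],
    'B': [0.05, 0.10, 0.10],
    'C': [1.00, 1.00, 0.80],
})

# Secondary Loss Event Magnitude (dollar amount of each loss type)
slem = pd.DataFrame({
    'A': [1_000, 2_000, 1_500],
    'B': [100, 50, 30],
    'C': [50, 90, 25],
})

# Elementwise product, then rowwise sum for total Secondary Loss
sl = (slef * slem).sum(axis=1).round(0)
print(sl.tolist())  # [1005.0, 1895.0, 773.0]
```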

Secondary Loss Event Frequency (“SLEF”)

Description

A matrix of probabilities with each row representing a single simulation, and each column representing the probability that a particular secondary loss type will occur

Restrictions

All matrix elements must be a number from 0.0 to 1.0

Derivation

None (this must be supplied, not calculated)

Example

For a given model, you may have three simulations and three separate loss types. This gives you a probability for each combination of simulation and loss type.

Simulation  Prob Loss A  Prob Loss B  Prob Loss C
----------  -----------  -----------  -----------
1           0.95         0.05         1.00
2           0.90         0.10         1.00
3           0.50         0.10         0.80

Secondary Loss Event Magnitude (“SLEM”)

Description

A matrix of currency amounts with each row representing a single simulation, and each column representing the amount of loss for a particular loss type

Restrictions

All matrix elements must be positive

Derivation

None (this must be supplied, not calculated)

Example

For a given model, you may have three simulations and three separate loss types. This gives you a dollar amount for each combination of simulation and loss type.

Simulation  $ Loss A  $ Loss B  $ Loss C
----------  --------  --------  --------
1           $1,000    $100      $50
2           $2,000    $50       $90
3           $1,500    $30       $25

Classes

Model Classes

FairModel

FairMetaModel

FairCalculations

FairDataInput

FairDependencyNode

class pyfair.model.model_node.FairDependencyNode(name)

Represents the status of a given calculation for FairDependencyTree

FairModel has a captive FairDependencyTree, and a FairDependencyTree is made of FairDependencyNodes. It is a simple structure that holds a status tag and references to related nodes, allowing the tree structure to be traversed.

Parameters:

name (str) – A human-readable designation for identification.

name

A human-readable designation for identification.

Type:

str

parent

The single node immediately above the current node in the tree (default is None).

Type:

pyfair.model.FairDependencyNode

children

A list of the child nodes below the node in the tree (default is an empty list).

Type:

list

status

An identifier that gives the status of the node (default is ‘Required’).

Type:

{‘Required’, ‘Not Required’, ‘Supplied’, ‘Calculable’, ‘Calculated’}

add_child(child)

Add a child to an individual node (orchestrated by tree)

Parameters:

child (pyfair.model.FairDependencyNode) – A node to attach as a child

add_parent(parent)

Add a parent to an individual node (orchestrated by tree)

Parameters:

parent (pyfair.model.FairDependencyNode) – A node to attach as a parent
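A minimal stand-in class can illustrate how these methods wire nodes together. This sketch only mirrors the documented attributes and method names; it is not pyfair's actual implementation.

```python
class DemoNode:
    """Illustrative, simplified stand-in for FairDependencyNode."""
    def __init__(self, name):
        self.name = name          # human-readable designation
        self.parent = None        # single node above (default None)
        self.children = []        # nodes below (default empty list)
        self.status = 'Required'  # default status

    def add_child(self, child):
        self.children.append(child)

    def add_parent(self, parent):
        self.parent = parent

# Link 'Risk' to 'Loss Event Frequency' in both directions,
# as the captive tree would when assembling its structure
risk = DemoNode('Risk')
lef = DemoNode('Loss Event Frequency')
risk.add_child(lef)
lef.add_parent(risk)
```

In pyfair itself, this wiring is handled by FairDependencyTree; user code normally never constructs nodes directly.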

FairDependencyTree

class pyfair.model.model_tree.FairDependencyTree

A captive class tracking FAIR calculation dependencies.

An instance of this class is created when a FairModel is instantiated. It is used during the lifetime of the FairModel to track what data has been supplied to the model. Consequently, it can be used to determine what further data is needed and what calculations can be performed.

It is created from a group of nodes of type FairDependencyNode. When data is supplied, information is propagated down the tree, and then back up the tree when performing calculations. On its own, the tree does little; the FairModel tells it what to do.

nodes

A dict mapping name strings to FairDependencyNode values

Type:

dict

Notes

http://pubs.opengroup.org/onlinepubs/9699919899/toc.pdf

calculation_completed()

Determine whether the model has been completed

Returns:

True if the calculation is complete, otherwise False

Return type:

bool

get_node_statuses()

Simple getter to obtain node statuses.

Returns:

A dict with keys of node names and values of node statuses

Return type:

dict

ready_for_calculation()

Ensure there are no required items remaining

Returns:

True if model is ready for calculation, otherwise False

Return type:

bool

update_status(node_name, new_status)

Notify node that data was provided

This function notifies the node that its status has changed and then propagates the change down the tree as necessary. For example, if data is supplied for a node, the nodes underneath it are no longer required; they are recursively updated down to the bottom of the tree.

In addition, information is propagated up the tree. For example, once a node's inputs are calculable (in other words, can be calculated), the node itself can be calculated; and if that calculation makes further nodes calculable, those are calculated in turn up the tree.

Parameters:
  • node_name (str) – The node in self.nodes that will be updated

  • new_status (str) – The new status with which to update the node

Example

>>> # usually captive, but we're instantiating here
>>> tree = FairDependencyTree()
>>> # This will update nodes below (and above)
>>> tree.update_status('Loss Event Frequency', 'Supplied')
>>> tree.update_status('Loss Magnitude', 'Supplied')
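The downward propagation described above can be sketched as follows. This is a simplified illustration using a hypothetical node class, not pyfair's actual code.

```python
class DemoNode:
    """Hypothetical, simplified node used only for this illustration."""
    def __init__(self, name, children=()):
        self.name = name
        self.status = 'Required'
        self.children = list(children)

def propagate_down(node, new_status='Not Required'):
    # Once a node's data is supplied, its descendants no longer need
    # direct input; mark them recursively to the bottom of the tree.
    for child in node.children:
        child.status = new_status
        propagate_down(child, new_status)

tef = DemoNode('Threat Event Frequency')
vuln = DemoNode('Vulnerability')
lef = DemoNode('Loss Event Frequency', [tef, vuln])

lef.status = 'Supplied'
propagate_down(lef)
# tef and vuln are now 'Not Required'
```

The real tree additionally walks upward after each update, checking whether any parent has become calculable.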

Report Classes

FairSimpleReport

FairBaseReport

FairBaseCurve

FairDistributionCurve

FairExceedenceCurves

FairTreeGraph

FairViolinPlot

Utility Classes

FairException

FairBetaPert

FairModelFactory

FairDatabase

Release Notes

Version 0.1-alpha.12

  • Data validation fixes for changes in 0.1-alpha.11

Version 0.1-alpha.11

  • Added support for different currency prefixes

  • Fixed errant abbreviations for input_data()

Version 0.1-alpha.10

  • Corrected inappropriate validation of gamma values

  • Added additional unit tests for FairDatabase

Version 0.1-alpha.9

  • Corrected erroneous Vulnerability calculation.

  • Updated links in README

Version 0.1-alpha.8

  • Fixed FairSimpleReport to allow for interactive generation.

Version 0.1-alpha.7

  • More descriptive error messages from FairSimpleReport

  • Documentation fixes

  • Fixes to FairSimpleReport (specifically SLEM)

  • Fix calculation_completed() to allow for direct input of Risk

Version 0.1-alpha.6

  • Fixed metadata of base report to auto-fetch names cross-platform.

  • Corrected erroneous statements in documentation related to Vulnerability.

Version 0.1-alpha.5

  • Added raw_input() function and associated storage routines.

  • Improved PEP8 compliance.

Version 0.1-alpha.4

  • Correct inappropriate Vulnerability calculation.

Version 0.1-alpha.3

  • Testing and documentation completed for the utility and report modules.

Version 0.1-alpha.2

  • Testing and documentation completed for the model module.

Version 0.1-alpha.1

  • Additional documentation for items in report and utility modules.

Version 0.1-alpha.0

  • Initial release of pyfair containing the foundational FairModel, FairMetaModel, FairSimpleReport, FairDatabase, and FairModelFactory classes.
