13. MTfit.inversion: Inversion object

The main MTfit inversion object is used to handle the forward model evaluation and process the results. The object initialisation and forward functions are documented here, although there are additional (private and special) functions that are documented in the source code.

class MTfit.inversion.Inversion(data={}, data_file=False, location_pdf_file_path=False, algorithm='iterate', parallel=True, n=0, phy_mem=8, dc=False, **kwargs)[source]

Main Inversion object

Runs the MT inversion, following the parameterisation set on initialisation. Parameters for the algorithm can be set via the kwargs.

Algorithm options are:

  • Time - Monte Carlo random sampling - runs until time limit reached.
  • Iterate - Monte Carlo random sampling - runs until sample limit reached.
  • McMC - Markov chain Monte Carlo sampling.
  • TransDMcMC - Trans-dimensional Markov chain Monte Carlo sampling.

These are discussed in more detail in the MTfit.algorithms documentation.

The inversion is run by calling the forward function (MTfit.inversion.Inversion.forward).

Data Format

The inversion expects a python dictionary of the data in the format:

>>data={'PPolarity':{'Measured':numpy.matrix([[-1],[-1],...]),
                     'Error':numpy.matrix([[0.01],[0.02],...]),
                     'Stations':{'Name':['Station1','Station2',...],
                                 'Azimuth':numpy.matrix([[248.0],[122.3],...]),
                                 'TakeOffAngle':numpy.matrix([[24.5],[2.8],...])
                                }
                     },
        'P/SHAmplitudeRatio':{...},
        ...
        'UID':'Event1'
        }

The top-level keys correspond to the data types that can be used in the inversion. The inversion uses polarity observations and amplitude ratios, and can also use relative amplitudes. This means that the usable data types are:

Polarity

  • PPolarity
  • SHPolarity
  • SVPolarity

Polarity observations made manually or automatically. The corresponding data dictionary for polarities needs the following keys:

  • Measured: numpy matrix of polarity observations

  • Error: numpy matrix of fractional uncertainty in the polarity observations.

  • Stations: dictionary of station information with keys:

    • Name: list of station names
    • Azimuth: numpy matrix of azimuth values in degrees
    • TakeOffAngle: numpy matrix of take off angle values in degrees - 0 down (NED coordinate system).

As with all of these dictionaries, the indices of the observations and errors must correspond to the stations, i.e. data['Measured'][0,0] is the observation from the station data['Stations']['Name'][0], with uncertainty data['Error'][0,:], etc.
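
For example, a complete P polarity data dictionary for two stations might look like the following sketch (the station names and numerical values are purely illustrative):

import numpy as np

# Illustrative P polarity data for two stations; row i of 'Measured' and
# 'Error' corresponds to 'Stations'['Name'][i].
polarity_data = {
    'PPolarity': {
        'Measured': np.matrix([[-1], [1]]),            # observed polarities
        'Error': np.matrix([[0.01], [0.02]]),          # fractional uncertainties
        'Stations': {
            'Name': ['S0001', 'S0002'],
            'Azimuth': np.matrix([[248.0], [122.3]]),      # degrees
            'TakeOffAngle': np.matrix([[24.5], [2.8]])     # degrees, 0 down (NED)
        }
    },
    'UID': 'Event1'
}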

If polarity probabilities are being used, the keys are:

  • PPolarityProbability
  • SHPolarityProbability
  • SVPolarityProbability

These have the same structure as the polarity data, except that the Measured matrix has an additional column, i.e. data['Measured'][0,0] is the positive polarity probability and data['Measured'][0,1] is the negative polarity probability.
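
For example, an illustrative P polarity probability dictionary for two stations (values made up; the Error and Stations entries are assumed to keep the same form as for standard polarity data):

import numpy as np

# Illustrative P polarity probabilities: column 0 is the positive polarity
# probability and column 1 the negative polarity probability.
polarity_probability_data = {
    'PPolarityProbability': {
        'Measured': np.matrix([[0.95, 0.05],
                               [0.30, 0.70]]),
        'Error': np.matrix([[0.01], [0.02]]),
        'Stations': {
            'Name': ['S0001', 'S0002'],
            'Azimuth': np.matrix([[248.0], [122.3]]),
            'TakeOffAngle': np.matrix([[24.5], [2.8]])
        }
    },
    'UID': 'Event1'
}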

Amplitude ratios

  • P/SHRMSAmplitudeRatio
  • P/SVRMSAmplitudeRatio
  • SH/SVRMSAmplitudeRatio
  • P/SHQRMSAmplitudeRatio
  • P/SVQRMSAmplitudeRatio
  • SH/SVQRMSAmplitudeRatio
  • P/SHAmplitudeRatio
  • P/SVAmplitudeRatio
  • SH/SVAmplitudeRatio
  • P/SHQAmplitudeRatio
  • P/SVQAmplitudeRatio
  • SH/SVQAmplitudeRatio

Amplitude ratio observations made manually or automatically. The Q in the name is not necessary, but is useful to label amplitudes that have had a Q (attenuation) correction applied. The corresponding data dictionary for amplitude ratios needs the following keys:

  • Measured: numpy matrix of corrected numerator and denominator amplitude ratio observations; it must have two columns, one for the numerator and one for the denominator.

  • Error: numpy matrix of uncertainties (standard deviations) in the amplitude ratio observations; it must have two columns, one for the numerator and one for the denominator.

  • Stations: dictionary of station information with keys:

    • Name: list of station names
    • Azimuth: numpy matrix of azimuth values in degrees
    • TakeOffAngle: numpy matrix of take off angle values in degrees - 0 down (NED coordinate system).

As with all of these dictionaries, the indices of the observations and errors must correspond to the stations, i.e. data['Measured'][0,0] is the observation from the station data['Stations']['Name'][0], with uncertainty data['Error'][0,:], etc.
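
For example, an illustrative P/SH RMS amplitude ratio dictionary for two stations (all values made up), with the numerator and denominator in the two columns of Measured and Error:

import numpy as np

# Illustrative P/SH RMS amplitude ratio data for two stations.
# Column 0 holds the numerator (P) values and column 1 the denominator (SH)
# values, for both the observations and their standard deviations.
amplitude_ratio_data = {
    'P/SHRMSAmplitudeRatio': {
        'Measured': np.matrix([[1.2, 0.4],
                               [0.8, 0.3]]),
        'Error': np.matrix([[0.05, 0.02],
                            [0.04, 0.02]]),
        'Stations': {
            'Name': ['S0001', 'S0002'],
            'Azimuth': np.matrix([[248.0], [122.3]]),
            'TakeOffAngle': np.matrix([[24.5], [2.8]])
        }
    },
    'UID': 'Event1'
}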

Relative Amplitude Ratios

  • PAmplitude
  • SHAmplitude
  • SVAmplitude
  • PQAmplitude
  • SHQAmplitude
  • SVQAmplitude
  • PRMSAmplitude
  • SHRMSAmplitude
  • SVRMSAmplitude
  • PQRMSAmplitude
  • SHQRMSAmplitude
  • SVQRMSAmplitude

Relative amplitude ratios use amplitude observations from different events, made manually or automatically. The Q in the name is not necessary, but is useful to label amplitudes that have had a Q (attenuation) correction applied. The corresponding data dictionary for relative amplitudes needs the following keys:

  • Measured: numpy matrix of amplitude observations for the event.

  • Error: numpy matrix of uncertainty (standard deviation) in the amplitude observations.

  • Stations: dictionary of station information with keys:

    • Name: list of station names
    • Azimuth: numpy matrix of azimuth values in degrees
    • TakeOffAngle: numpy matrix of take off angle values in degrees - 0 down (NED coordinate system).

As with all of these dictionaries, the indices of the observations and errors must correspond to the stations, i.e. data['Measured'][0,0] is the observation from the station data['Stations']['Name'][0], with uncertainty data['Error'][0,:], etc.
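
For example, a sketch of relative P amplitude data for two events (all station names and values are illustrative). Each event has its own data dictionary, and the resulting list is passed to the inversion together with the multiple_events and relative_amplitude flags described below:

import numpy as np

# Illustrative relative P amplitude data for two events recorded at the
# same two stations; a joint inversion is run over the list of per-event
# dictionaries.
event_1 = {
    'PAmplitude': {
        'Measured': np.matrix([[1.3], [0.7]]),
        'Error': np.matrix([[0.1], [0.1]]),
        'Stations': {
            'Name': ['S0001', 'S0002'],
            'Azimuth': np.matrix([[248.0], [122.3]]),
            'TakeOffAngle': np.matrix([[24.5], [2.8]])
        }
    },
    'UID': 'Event1'
}
event_2 = {
    'PAmplitude': {
        'Measured': np.matrix([[0.9], [1.1]]),
        'Error': np.matrix([[0.1], [0.1]]),
        'Stations': {
            'Name': ['S0001', 'S0002'],
            'Azimuth': np.matrix([[250.1], [121.0]]),
            'TakeOffAngle': np.matrix([[25.0], [3.1]])
        }
    },
    'UID': 'Event2'
}
relative_data = [event_1, event_2]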

Angle Scatter Format

The angle scatter files can be generated using the Scat2Angle utility, which is based on the NonLinLoc angle code. An angle scatter file is a text file with samples separated by blank lines. Each sample consists of take-off and azimuth angles (in degrees) for the stations, converted from a sample of the location PDF, along with a probability value. It is important that the samples are drawn from the location PDF, because a Monte Carlo based integration approach is used to marginalise over the location uncertainty. A minimal sketch of reading this format is given after the example below.

It is also possible to use the XYZ2Angle utility to convert samples drawn from a location PDF into angles using the NonLinLoc angle approach.

The expected format is:

Probability
StationName Azimuth TakeOffAngle
StationName Azimuth TakeOffAngle

Probability
.
.
.

TakeOffAngle is measured with 0 down (NED coordinate system), e.g.:

504.7
S0271   231.1   154.7
S0649   42.9    109.7
S0484   21.2    145.4
S0263   256.4   122.7
S0142   197.4   137.6
S0244   229.7   148.1
S0415   75.6    122.8
S0065   187.5   126.1
S0362   85.3    128.2
S0450   307.5   137.7
S0534   355.8   138.2
S0641   14.7    120.2
S0155   123.5   117
S0162   231.8   127.5
S0650   45.9    108.2
S0195   193.8   147.3
S0517   53.7    124.2
S0004   218.4   109.8
S0588   12.9    128.6
S0377   325.5   165.3
S0618   29.4    120.5
S0347   278.9   149.5
S0529   326.1   131.7
S0083   223.7   118.2
S0595   42.6    117.8
S0236   253.6   118.6

502.7
S0271   233.1   152.7
S0649   45.9    101.7
S0484   25.2    141.4
S0263   258.4   120.7
.
.
.
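
The following is a minimal sketch, not part of MTfit (the function name is hypothetical), for reading this format into a list of samples:

# A minimal sketch of parsing the angle scatter text format described above.
# Each sample becomes a dictionary holding its probability value and the
# per-station (azimuth, take-off angle) pairs in degrees.
def read_angle_scatter(filename):
    with open(filename) as f:
        blocks = f.read().strip().split('\n\n')   # samples separated by blank lines
    samples = []
    for block in blocks:
        lines = block.strip().splitlines()
        if not lines:
            continue
        probability = float(lines[0])             # first line is the probability value
        stations = {}
        for line in lines[1:]:                    # StationName Azimuth TakeOffAngle
            name, azimuth, takeoff_angle = line.split()
            stations[name] = (float(azimuth), float(takeoff_angle))
        samples.append({'Probability': probability, 'Stations': stations})
    return samples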

Initialisation of inversion object

Parameters:
  • data (dict/list) -- Dictionary or list of dictionaries containing data for the inversion. Can be omitted if a data_file is passed as an argument (for the data format, see above).
  • data_file (str) -- Path or list of file paths containing (binary) pickled data dictionaries.
  • location_pdf_file_path (str) -- Path or list of file paths to angle scatter files (for file format, see above - other format extensions can be added using setuptools entry points).
  • algorithm (str) -- ['iterate'] algorithm selector
  • parallel (bool) -- [True] Run the inversion in parallel using multiprocessing
  • n (int) -- [0] Number of workers, default is to use as many as returned by multiprocessing.cpu_count
  • phy_mem (int) -- [8] Estimated physical memory to use in GB (used for determining array sizes; more memory is likely to be used, and if so no errors are raised). On Python versions <2.7.4 there is a bug (http://bugs.python.org/issue13555) with pickle that limits the total number of samples when running in parallel, so large (>20GB) phy_mem allocations per process are ignored.
  • dc (bool) -- [False] Boolean flag to constrain the inversion to double-couple sources (True) or allow it to explore the full moment tensor space (False).
Keyword Arguments:
 
  • number_stations (int) -- [0] Used for estimating sample sizes in the Monte Carlo random sampling algorithms (Time,Iterate) if set.
  • number_location_samples (int) -- [0] Used for estimating sample sizes in the Monte Carlo random sampling algorithms (Time,Iterate) if set.
  • path (str) -- File path for output. Default is current working dir (interactive) or PBS workdir.
  • fid (str) -- File name root for output, default is to use MTfitOutput or the event UID if set in the data dictionary.
  • inversion_options (list) -- List of data types to be used in the inversion. If not set, the inversion uses all the data types in the data dictionary, irrespective of independence.
  • diagnostic_output (bool) -- [False] Boolean flag to output diagnostic information in MATLAB save file.
  • marginalise_relative (bool) -- [False] Boolean flag to marginalise over the location/model uncertainty between the absolute and relative data during relative amplitude inversion.
  • mpi (bool) -- [False] Boolean flag to run using MPI from mpi4py (useful on cluster).
  • multiple_events (bool) -- [False] Boolean flag to run all the events in the inversion in one single joint inversion.
  • relative_amplitude (bool) -- [False] Boolean flag to run multiple_events including relative amplitude inversion.
  • output_format (str) -- ['matlab'] Output format style.
  • minimum_number_intersections (int) -- [2] Integer minimum number of station intersections between events in relative amplitude inversion.
  • quality_check (bool) -- [False] Boolean flag/Float value for maximum non-zero percentage check (stops inversion if event quality is poor)
  • recover (bool) -- [False] Tries to recover pre-existing inversions, and does not re-run if an output file exists and is readable.
  • file_sample (bool) -- [False] Saves samples to a mat file rather than keeping them in memory (allows easier recovery and reduces memory requirements, but can slow the inversion down as it requires a disk write on every non-zero iteration; only works well for recovery with random sampling methods).
  • results_format (str) -- ['full_pdf'] Data format for the output
  • normalise (bool) -- [True] normalise output Probability (doesn't affect ln_pdf output).
  • convert (bool) -- Convert output moment tensors to Tape parameters, Hudson u&v coordinates and strike-dip-rake triples.
  • discard (bool/float) -- [False] Probability cut-off for discarding samples: samples with likelihood less than 1/(discard*n_samples) of the maximum likelihood value are discarded as negligible. False means no samples are discarded.
  • c_generate (bool) -- [False] Generate samples in the probability calculation when using Cython.
  • generate_cutoff (int) -- Set number of samples to cut-off at when using c_generate (Default is the value of max_samples)
  • relative_loop (bool) -- [False] Loop over non-zero samples when using relative amplitudes.
  • bin_angle_coefficient_samples (int) -- [0] Bin size in degrees when binning angle coefficients (All station angle differences must be within this range for samples to fall in the same bin)
  • no_station_distribution (bool) -- [True] Boolean flag to suppress output of the station distribution.
  • max_samples (int) -- [6000000] Max number of samples when using the iterate algorithm.
  • max_time (int) -- [600] Maximum time in seconds when using the time algorithm.
  • verbosity (int) -- [0] Set the verbosity level (0-4); higher values mean more logging output, and verbosity==4 means the debugger will be called on errors.
  • debug (bool) -- [False] Sets debug on or off (True means verbosity set to 4).

Other kwargs are passed to the algorithm - see MTfit.algorithms documentation for help on those.

forward()[source]

Runs the event forward model using the arguments set when the Inversion object was initialised.

Depending on the algorithm selected, this uses either Monte Carlo random sampling or Markov chain Monte Carlo sampling; for more information on the different algorithms, see the MTfit.algorithms documentation.
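
For example, a minimal usage sketch (station names, values and keyword choices are illustrative; output is written according to the path and fid settings described above):

import numpy as np
from MTfit.inversion import Inversion

# Illustrative P polarity data for a single event (see the Data Format
# section above).
data = {
    'PPolarity': {
        'Measured': np.matrix([[-1], [1], [1]]),
        'Error': np.matrix([[0.01], [0.02], [0.01]]),
        'Stations': {
            'Name': ['S0001', 'S0002', 'S0003'],
            'Azimuth': np.matrix([[248.0], [122.3], [35.7]]),
            'TakeOffAngle': np.matrix([[24.5], [2.8], [112.1]])
        }
    },
    'UID': 'Event1'
}

# Monte Carlo random sampling of the full moment tensor space until
# max_samples is reached; set dc=True to constrain the inversion to
# double-couple sources.
inversion = Inversion(data=data, algorithm='iterate', parallel=False,
                      max_samples=100000, convert=True)
inversion.forward()   # runs the inversion and writes the results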