Time-Frequency Signal and System Analysis

In Time Frequency Analysis, 2003

Joint distributions generalize single-variable distributions that measure the energy content of some physical quantity in a signal. Given a quantity $a$ represented by the Hermitian (self-adjoint) operator $A$, we obtain the density $|(\mathcal{F}_A s)(a)|^2$ measuring the "$a$ content" of the signal $s$ simply by squaring the projection of $s$ onto the formal eigenfunctions $u_a^A$ of $A$ [5].
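In symbols (a schematic rendering, since the chapter's exact notation is only partially recoverable from this excerpt), the projection coefficient and the resulting density are

$(\mathcal{F}_A s)(a) = \int u_a^{A\,*}(t)\, s(t)\, dt, \qquad P(a) = \left|(\mathcal{F}_A s)(a)\right|^2.$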

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780080443355500255

Load and Global Response of Ships

J.J. Jensen, in Elsevier Ocean Engineering Series, 2001

3.1.6 Probability Distributions of Several Variables

The probability distribution $F(x_1, x_2, \ldots, x_n)$ of $n$ random variables $X_i$, $i = 1, 2, \ldots, n$, is defined by

(3.55) $F(x_1, x_2, \ldots, x_n) = P\left\{X_1 \le x_1,\; X_2 \le x_2,\; \ldots,\; X_n \le x_n\right\}$

as a generalisation of Eq. (3.1). For continuous variables, the joint probability density function $p(x_1, x_2, \ldots, x_n)$ is given by

(3.56) $F(x_1, x_2, \ldots, x_n) = \int_{a_1}^{x_1} \int_{a_2}^{x_2} \cdots \int_{a_n}^{x_n} p(u_1, u_2, \ldots, u_n)\, du_1\, du_2 \cdots du_n$

where $a_i$ is the lower boundary on $X_i$. Provided that $F(x_1, x_2, \ldots, x_n)$ is differentiable for all values of $x_i$, then

(3.57) $p(x_1, x_2, \ldots, x_n) = \dfrac{\partial^n F(x_1, x_2, \ldots, x_n)}{\partial x_1\, \partial x_2 \cdots \partial x_n}$

If the individual random variables $X_i$ are statistically independent, then Eqs. (3.55)–(3.57) yield

(3.58) $F(x_1, x_2, \ldots, x_n) = F_{X_1}(x_1)\, F_{X_2}(x_2) \cdots F_{X_n}(x_n)$

and

(3.59) $p(x_1, x_2, \ldots, x_n) = p_{X_1}(x_1)\, p_{X_2}(x_2) \cdots p_{X_n}(x_n)$

where $F_{X_i}(x_i)$ and $p_{X_i}(x_i)$ are called the marginal distribution and marginal density functions, respectively. For statistically dependent variables, the marginal density function for $X_i$ becomes

(3.60) $p_{X_i}(x_i) = \int_{a_1}^{b_1} \cdots \int_{a_{i-1}}^{b_{i-1}} \int_{a_{i+1}}^{b_{i+1}} \cdots \int_{a_n}^{b_n} p(x_1, x_2, \ldots, x_n)\, dx_1 \cdots dx_{i-1}\, dx_{i+1} \cdots dx_n$

where $b_j$ is the upper boundary on $X_j$. The marginal distribution is obtained by integration of Eq. (3.60):

(3.61) $F_{X_i}(x_i) = \int_{a_i}^{x_i} p_{X_i}(u_i)\, du_i$

The marginal distribution function $F_{X_i}(x_i)$ expresses the probability that a variable $X_i$ is less than or equal to $x_i$, irrespective of the values of all the other variables $X_j$, $j = 1, \ldots, n$; $j \ne i$. If, on the contrary, the values of $X_j$ are known, i.e. $X_j = x_j$, $j = 1, 2, \ldots, n$; $j \ne i$, then the conditional probability distribution for $X_i$ is defined as

(3.62) $P\left\{X_i \le x_i \mid X_j = x_j;\ j = 1, 2, \ldots, n;\ j \ne i\right\} = F(x_i \mid x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) = \dfrac{\int_{a_i}^{x_i} p(x_1, x_2, \ldots, x_{i-1}, u_i, x_{i+1}, \ldots, x_n)\, du_i}{\int_{a_i}^{b_i} p(x_1, x_2, \ldots, x_{i-1}, u_i, x_{i+1}, \ldots, x_n)\, du_i}$

where the denominator is the marginal density of $X_1, X_2, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n$. The conditional probability density function is obtained by replacing the numerator on the right-hand side with $p(x_1, x_2, \ldots, x_n)$, i.e. by differentiation of Eq. (3.62) with respect to $x_i$.

For two random variables $(X, Y)$, Eqs. (3.60)–(3.62) yield the following relation between joint, marginal and conditional probability densities:

(3.63) $p(x, y) = p(x \mid y)\, p_Y(y)$
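To make Eq. (3.63) concrete, the sketch below (not from the chapter; it assumes a bivariate normal density purely for illustration) checks numerically that the joint density equals the conditional density times the marginal:

```python
import numpy as np

rho = 0.5  # correlation of an assumed illustrative bivariate normal

def p_joint(x, y):
    # Standard bivariate normal density with correlation rho.
    norm = 1.0 / (2 * np.pi * np.sqrt(1 - rho**2))
    return norm * np.exp(-(x**2 - 2*rho*x*y + y**2) / (2 * (1 - rho**2)))

def p_y(y):
    # Marginal of Y (Eq. (3.60) done in closed form): standard normal.
    return np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)

def p_x_given_y(x, y):
    # Conditional of X given Y = y: normal, mean rho*y, variance 1 - rho^2.
    var = 1 - rho**2
    return np.exp(-(x - rho*y)**2 / (2*var)) / np.sqrt(2 * np.pi * var)

# Check Eq. (3.63), p(x, y) = p(x|y) p_Y(y), on a grid of points.
xs, ys = np.meshgrid(np.linspace(-3, 3, 61), np.linspace(-3, 3, 61))
lhs = p_joint(xs, ys)
rhs = p_x_given_y(xs, ys) * p_y(ys)
print("max abs difference:", np.abs(lhs - rhs).max())  # ~1e-16
```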

The moments $E[G(X_1, X_2, \ldots, X_n)]$ of any combination $G(X_1, X_2, \ldots, X_n)$ of the random variables are defined by

(3.64) $E[G(X_1, X_2, \ldots, X_n)] = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} G(x_1, x_2, \ldots, x_n)\, p(x_1, x_2, \ldots, x_n)\, dx_1\, dx_2 \cdots dx_n$

The most useful moments are the central moments:

(3.65) $\zeta_{m_1, m_2, \ldots, m_n} = E\left[(X_1 - \mu_1)^{m_1} (X_2 - \mu_2)^{m_2} \cdots (X_n - \mu_n)^{m_n}\right]$

where the mean values are given as

(3.66) $\mu_i = E[X_i] = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} x_i\, p(x_1, x_2, \ldots, x_n)\, dx_1\, dx_2 \cdots dx_n = \int_{a_i}^{b_i} x_i\, p_{X_i}(x_i)\, dx_i$

Of special importance is the covariance matrix $\overline{\overline{\Sigma}}$:

(3.67) $\overline{\overline{\Sigma}} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \vdots & & \ddots & \vdots \\ \sigma_{n1} & & \cdots & \sigma_{nn} \end{bmatrix}$

with the components

(3.68) $\sigma_{ij} \equiv \operatorname{cov}(X_i, X_j) = E\left[(X_i - \mu_i)(X_j - \mu_j)\right]$

The diagonal term $\sigma_{ii}$ is seen to be the variance of the variable $X_i$.

A non-dimensional measure of covariance is the correlation matrix $\overline{\overline{\rho}}$:

(3.69) $\overline{\overline{\rho}} = \begin{bmatrix} \rho_{11} & \rho_{12} & \cdots & \rho_{1n} \\ \vdots & & \ddots & \vdots \\ \rho_{n1} & & \cdots & \rho_{nn} \end{bmatrix}$

where each of the correlation coefficients is defined as

(3.70) $\rho_{ij} = \dfrac{\sigma_{ij}}{s_i s_j}$

by use of the standard deviations

(3.71) $s_i = \sqrt{E\left[(X_i - \mu_i)^2\right]} = \sqrt{\sigma_{ii}}$

as normalisation factors. If the variables $X_i$ and $X_j$ are independent, then

(3.72) $\sigma_{ij} = E[X_i - \mu_i]\, E[X_j - \mu_j] = 0$

because the expectation factorizes by Eq. (3.59) and each factor vanishes by Eq. (3.66). Thus, statistically independent pairs of random variables have zero off-diagonal covariance and correlation coefficients. The reverse is not always true, but it holds, for instance, for the multivariate normal distribution defined by the joint probability density function:

(3.73) p x 1 x 2 x n = 1 2 π n Σ ¯ ¯ exp 1 2 x ¯ μ ¯ T Σ ¯ ¯ 1 x ¯ μ ¯

where $\overline{\overline{\Sigma}}$ is given by Eq. (3.67), $|\cdot|$ denotes the matrix determinant, and the overbar denotes a vector.

This joint distribution clearly becomes the product of the density functions of each of the variables $x_i$ if $\sigma_{ij} = 0$ for $i \ne j$. Thus, in this case, zero correlation also implies statistical independence.
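The sketch below (illustrative only; it assumes a two-dimensional normal sample) estimates the covariance and correlation matrices of Eqs. (3.67)–(3.71) from data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example: n = 2 jointly normal variables with sigma_12 = 0,
# so by Eq. (3.73) they are also statistically independent.
cov_true = np.array([[2.0, 0.0],
                     [0.0, 0.5]])
x = rng.multivariate_normal(mean=[1.0, -1.0], cov=cov_true, size=100_000)

# Sample estimate of the covariance matrix, Eqs. (3.67)/(3.68) ...
sigma = np.cov(x, rowvar=False)

# ... and of the correlation matrix, Eqs. (3.69)/(3.70): rho_ij = sigma_ij / (s_i s_j).
s = np.sqrt(np.diag(sigma))        # standard deviations, Eq. (3.71)
rho = sigma / np.outer(s, s)

print("estimated covariance matrix:\n", sigma)   # close to cov_true
print("estimated correlation matrix:\n", rho)    # off-diagonals close to 0
```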

Example 3.1.3

Consider two variables $(X_1, X_2)$ with the joint probability density function

$p(x_1, x_2) = \frac{6}{7}(x_1 + x_2) + \frac{1}{7}; \qquad x_1, x_2 \in [0, 1]$

It is seen that $p(x_1, x_2)$ satisfies

$\int_0^1 \int_0^1 p(x_1, x_2)\, dx_1\, dx_2 = 1$

as well as

$p(x_1, x_2) \ge 0 \quad \text{for} \quad x_1, x_2 \in [0, 1]$

as required for a probability density function. Furthermore, the mean values are found to be

$\mu_i = \int_0^1 \int_0^1 x_i\, p(x_1, x_2)\, dx_1\, dx_2 = \frac{9}{14}; \qquad i = 1, 2$

and the covariance to be

$\operatorname{cov}(x_1, x_2) = E\left[(X_1 - \mu_1)(X_2 - \mu_2)\right] = \int_0^1 \int_0^1 (x_1 - \mu_1)(x_2 - \mu_2)\, p(x_1, x_2)\, dx_1\, dx_2 = 0$

The variables $X_1$ and $X_2$ are thus uncorrelated. By application of Eq. (3.60) the marginal probability densities become

$p_{X_1}(x_1) = \int_0^1 p(x_1, x_2)\, dx_2 = \frac{6}{7} x_1 + \frac{4}{7}, \qquad p_{X_2}(x_2) = \int_0^1 p(x_1, x_2)\, dx_1 = \frac{6}{7} x_2 + \frac{4}{7}$

As

$p(x_1, x_2) \ne p_{X_1}(x_1)\, p_{X_2}(x_2)$

the two variables are, however, not statistically independent.
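A sympy sketch of the marginalization step (checking only the normalization, Eq. (3.60), and the product test, under the density as printed above):

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", nonnegative=True)
p = sp.Rational(6, 7) * (x1 + x2) + sp.Rational(1, 7)   # density as printed

# Normalization check.
print(sp.integrate(p, (x1, 0, 1), (x2, 0, 1)))          # 1

# Marginal densities, Eq. (3.60).
p1 = sp.integrate(p, (x2, 0, 1))                        # 6*x1/7 + 4/7
p2 = sp.integrate(p, (x1, 0, 1))                        # 6*x2/7 + 4/7
print(p1, p2)

# The joint density is not the product of the marginals,
# so X1 and X2 are not statistically independent.
print(sp.simplify(p - p1 * p2) == 0)                    # False
```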

The correlation coefficients $\rho_{ij}$, Eq. (3.70), are bounded:

(3.74) $-1 \le \rho_{ij} \le 1$

as

$\left(E\left[(X_i - \mu_i)(X_j - \mu_j)\right]\right)^2 \le E\left[(X_i - \mu_i)^2\right] E\left[(X_j - \mu_j)^2\right]$

due to Schwarz's inequality.

A correlation coefficient equal to zero signifies uncorrelated variables. For $\rho_{ij} = \pm 1$, it follows from Eq. (3.70) that

$E\left[(X_i - \mu_i)(X_j - \mu_j)\right] = \pm s_i s_j$

This can only be satisfied for

$\dfrac{X_i - \mu_i}{s_i} = \pm \dfrac{X_j - \mu_j}{s_j}$

which expresses complete correlation between $X_i$ and $X_j$. The value of $\rho_{ij}$ therefore serves as a convenient measure of the correlation between two random variables.

Read full chapter: https://www.sciencedirect.com/science/article/pii/S1571995201800052

Biomethane for transport applications

Mattias Svensson, in The Biogas Handbook, 2013

18.2.3 Synergies of joint distribution and utilization of biomethane and natural gas

The synergies of joint distribution of biomethane and natural gas are quite obvious. Biomethane is the same molecule as natural gas, so allowing it to use the natural gas transport infrastructure decreases total costs and makes it possible to utilize the full energy of the biogas potential (Thamsiriroj et al., 2011). At the same time, the renewability of the energy gas infrastructure is increased.

When the natural gas grid distribution system is not an option, several synergies arise from allowing joint utilization of natural gas and biomethane for automotive fuel purposes. The irrefutable environmental benefits of biomethane make it the preferred choice at all times, but in an emerging market its production is too insecure and too small to meet demand smoothly. Here, natural gas can not only initiate and accelerate market penetration while biomethane production capacity is built up, easing the unavoidable chicken-and-egg situation, but can also serve as a backup and secure source of supply in the event of production failures or sudden growth in demand.

The Swedish NGV market is a showcase for this type of synergy. Gas grid coverage is limited to the west coast of Sweden, making it necessary to utilize biogas as the main source of gas in the rest of the country. Natural gas in compressed and liquefied form is used as backup to sustain the biomethane market development. At times of accelerated market expansion, the use of natural gas may increase for a time, but customer preferences motivate the gas suppliers to strive for a growing share of renewable methane, even in the parts of Sweden with natural gas grid access. Over time, the volumes of biomethane on the Swedish NGV market have continually increased, in 2011 reaching 60% on an energy basis in a total market of more than 1200 GWh, supplying 38 600 vehicles, of which a significant portion are buses.

Road transport of biomethane should be avoided for larger volumes because grid transport is so much better in terms of both costs and energy expenditure. This is addressed by investing in local gas grids. The expansion and connection of local grids to the national grid is a natural progression in an expanding biomethane market, once again showing the importance of natural gas and biomethane working together in the same market.

It can be envisaged that liquefied biogas (LBG) will change the market conditions in a very positive manner for countries with conditions similar to Sweden's. The three types of distribution – by grid, LNG by road and CNG by road – will coexist, fulfilling different needs of the market.

Read full chapter: https://www.sciencedirect.com/science/article/pii/B978085709498850018X

Modeling Financial Data with Stable Distributions

John P. Nolan, in Handbook of Heavy Tailed Distributions in Finance, 2003

7 Multivariate application

Here we will examine the joint distribution of the German Mark and the Japanese Yen. The data set is the one described above in the univariate example. We are interested both in assessing whether the joint distribution is bivariate stable and in estimating the fit.

Figure 9 shows a sequence of smoothed density, q-q plot and variance-stabilized p-p plot for projections in 8 different directions: $\pi/2, \pi/3, \pi/4, \pi/6, 0, -\pi/6, -\pi/4, -\pi/3$. (We restrict to the right half plane because projections in the left half plane are reflections of those in the right half plane.) These projections are similar to Figure 2; in fact, the fifth row of Figure 9 is exactly the same as Figure 2. Except on the extreme tails, the stable fit does a good job of describing the data.

Fig. 9. Projection diagnostics for the German Mark and Japanese Yen exchange rates.

The projection functions $\alpha(t)$, $\beta(t)$, $\gamma(t)$, and $\delta(t)$ were estimated and used to compute an estimate of the spectral measure using the projection method. The results are shown in Figure 10. It shows a discrete estimate of the spectral measure (with $m = 100$ evenly spaced point masses) in polar form, a cumulative plot of the spectral measure in rectangular form, and then four plots of the parameter estimates $(\alpha(t), \beta(t), \gamma(t), \delta(t))$. Also on the $\alpha(t)$ plot is a horizontal line showing the average value of all the estimated indices, which is taken as the estimate of the common $\alpha$ that should come from a jointly stable distribution. The plots of $\beta(t)$ and $\gamma(t)$ also show the skewness and scale functions computed from the estimated spectral measure substituted into (9). These curves, which are based on a joint estimate of the spectral measure, are indistinguishable from the direct, separate estimates of the directional parameters.
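As a rough sketch of the projection idea (not Nolan's code; it uses synthetic data and scipy's generic stable fitting, which is slow but serviceable), one can project bivariate data onto each direction and fit the four stable parameters direction by direction:

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(1)

# Assumed stand-in for the exchange-rate returns: a synthetic sample with
# heavy-tailed (stable, alpha = 1.7) independent components.
n = 500
data = np.column_stack([
    levy_stable.rvs(1.7, 0.0, size=n, random_state=rng),
    levy_stable.rvs(1.7, 0.0, size=n, random_state=rng),
])

# Directions in the right half plane, as in Figure 9.
angles = [np.pi/2, np.pi/3, np.pi/4, np.pi/6, 0.0, -np.pi/6, -np.pi/4, -np.pi/3]

for t in angles:
    u = np.array([np.cos(t), np.sin(t)])   # unit vector for direction t
    proj = data @ u                        # one-dimensional projection
    # For a jointly stable vector, the fitted index alpha should be
    # (approximately) the same in every direction.
    alpha, beta, loc, scale = levy_stable.fit(proj)
    print(f"t = {t:+.3f}: alpha = {alpha:.2f}, beta = {beta:+.2f}, "
          f"scale = {scale:.2f}, loc = {loc:+.2f}")
```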

Fig. 10. Estimation results for the German Mark and Japanese Yen exchange rates.

The fitted spectral measure was used to plot the fitted bivariate density shown in Figure 11. The spread of the spectral measure is spiky and masks a pattern that is more obvious in the density surface: the approximately elliptical contours of the fitted density. This suggests modeling the data by a sub-Gaussian stable distribution, a topic discussed in the next section.

Fig. 11. Estimated density surface and level curves for a bivariate stable fit to the German Mark and Japanese Yen exchange rates.

Some comments on these plots. The polar plots of the spectral measure show a unit circle and lines connecting the points $(\theta_j, r_j)$, where $\theta_j = 2\pi(j-1)/m$ and $r_j = 1 + \lambda_j/\lambda_{\max}$, with $\lambda_{\max} = \max_j \lambda_j$. The polar plots are spiky because we are estimating a discrete object. What should be looked at is the overall spread of mass, not specific spikes in the plot. In cases where the spectral measure is really smooth, it may be appropriate to smooth these plots to better show its true nature. In cases where the measure is discrete, i.e., the independent case, one wants to emphasize the spikes. So there is no satisfactory general solution, and we just plot the raw data.
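A minimal sketch of that polar representation (assuming an arbitrary vector of point masses $\lambda_j$, since the estimated values are not reproduced here):

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed example: m point masses of a discrete spectral-measure estimate.
m = 100
rng = np.random.default_rng(2)
lam = rng.exponential(size=m)             # stand-in values for lambda_j

theta = 2 * np.pi * np.arange(m) / m      # theta_j = 2*pi*(j-1)/m
r = 1 + lam / lam.max()                   # r_j = 1 + lambda_j / lambda_max

ax = plt.subplot(projection="polar")
ax.plot(np.append(theta, theta[0]), np.append(r, r[0]))  # close the curve
ax.plot(theta, np.ones(m), linestyle="--")               # the unit circle
ax.set_rticks([])        # the overall spread of mass matters, not the ticks
plt.show()
```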

Finally, most graphing programs will set the vertical scale so that the curve fills the frame. This emphasizes minor fluctuations in the data that are not of practical significance. In the graphs in Figure 10, the vertical scales for the parameter functions $\alpha(t)$, $\beta(t)$, $\gamma(t)$ are respectively $[0, 2]$, $[-1, 1]$, and $[0,\ 1.2 \times \max \gamma(t)]$. These bounds show how the functions vary over their possible range. For $\delta(t)$, we used the bounds $[-1.2 \times \max|\delta(t)|,\ 1.2 \times \max|\delta(t)|]$, which visually exaggerates the changes in $\delta(t)$. A scale that depends on $\max \gamma(t)$ may be more appropriate.

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780444508966500054

THEORY OF MULTIPLICITY IN NUCLEAR SAFEGUARDS

IMRE PÁZSIT, LÉNÁRD PÁL, in Neutron Fluctuations, 2008

11.2 BASIC EQUATIONS

Since the number distribution of neutrons and gamma quanta generated in the sample will be derived by starting with one initial event, a backward equation formalism will be used. Then, as usual, one has to proceed in two steps, first by deriving an equation for the distribution of neutrons induced by one initial particle, and then another equation connecting the source-induced and the single-particle-induced distributions.

In order to investigate the joint distribution of the random variables $\nu$ and $\mu$, define the probability

(11.29) $P\{\nu = n_1,\ \mu = n_2 \mid 1\} = w(n_1, n_2 \mid 1)$

that the numbers of neutrons and gamma quanta emitted from a sample are exactly $n_1$ and $n_2$, respectively, provided that the cascade was started by one neutron. One can write that

(11.30) $w(n_1, n_2 \mid 1) = (1 - p)\, \delta_{n_1,1}\, \delta_{n_2,0} + p \sum_{k=0}^{\infty} p_r(k) \sum_{\ell=0}^{\infty} f_r(\ell) \sum_{n_{11} + \cdots + n_{1k} = n_1}\ \sum_{n_{21} + \cdots + n_{2k} = n_2 - \ell}\ \prod_{i=1}^{k} w(n_{1i}, n_{2i} \mid 1).$

Introducing the generating function

(11.31) $u(z_1, z_2 \mid 1) \equiv u(z_1, z_2) = \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} w(n_1, n_2 \mid 1)\, z_1^{n_1} z_2^{n_2},$

one obtains

(11.32) $u(z_1, z_2) = (1 - p)\, z_1 + p\, r_r(z_2)\, q_r[u(z_1, z_2)].$

Let

(11.33) $P\{\nu = n_1,\ \mu = n_2 \mid s\} = W(n_1, n_2 \mid s)$

be the probability that the numbers of neutrons and gamma quanta emitted from a sample are exactly $n_1$ and $n_2$, respectively, provided that the cascade was started by one source event $s$. Since

(11.34) $W(n_1, n_2 \mid s) = \sum_{\ell=0}^{\infty} f_s(\ell) \sum_{k=0}^{\infty} p_s(k) \sum_{n_{11} + \cdots + n_{1k} = n_1}\ \sum_{n_{21} + \cdots + n_{2k} = n_2 - \ell}\ \prod_{i=1}^{k} w(n_{1i}, n_{2i} \mid 1),$

it can be immediately shown that the generating function

(11.35) $U(z_1, z_2 \mid s) = \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} W(n_1, n_2 \mid s)\, z_1^{n_1} z_2^{n_2}$

satisfies the equation

(11.36) $U(z_1, z_2 \mid s) \equiv U(z_1, z_2) = r_s(z_2)\, q_s[u(z_1, z_2)].$

From equations (11.32) and (11.36), all the joint and individual moments and probability distributions of the numbers of generated neutrons and gamma photons can be derived. The individual distributions for neutrons and gamma quanta are contained as special cases, obtained by setting $z_2 = 1$ and $z_1 = 1$, respectively. They will be described first, before turning to the joint moments and distributions.
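A small Monte Carlo sketch of the branching process behind Eq. (11.30) (illustrative only; the fission probability $p$ and the multiplicity distributions $p_r$, $f_r$ below are invented placeholders, not data from the chapter):

```python
import random
from collections import Counter

p = 0.3                                  # prob. a neutron induces fission (assumed)
p_r = {0: 0.2, 1: 0.3, 2: 0.3, 3: 0.2}   # neutrons per induced fission (assumed)
f_r = {0: 0.3, 1: 0.4, 2: 0.3}           # gamma quanta per induced fission (assumed)
# Subcritical: p * E[p_r] = 0.45 < 1, so every cascade terminates.

def sample(dist):
    return random.choices(list(dist), weights=list(dist.values()))[0]

def cascade():
    """Return (n1, n2): neutrons and gammas emitted from a one-neutron cascade."""
    n1 = n2 = 0
    pending = 1                   # neutrons still to be followed
    while pending:
        pending -= 1
        if random.random() > p:   # no fission: the neutron leaks out ...
            n1 += 1               # ... and is emitted (the delta_{n1,1} term)
        else:                     # induced fission: l gammas, k daughter neutrons
            n2 += sample(f_r)
            pending += sample(p_r)
    return n1, n2

# Estimate the joint distribution w(n1, n2 | 1) by simulation.
counts = Counter(cascade() for _ in range(100_000))
for (n1, n2), c in sorted(counts.items())[:10]:
    print(f"w({n1}, {n2} | 1) ~ {c / 100_000:.4f}")
```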

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780080450643500144

11th International Symposium on Process Systems Engineering

Zhonggai Zhao, ... Fei Liu, in Computer Aided Chemical Engineering, 2012

3.2 Smoothing for Joint distribution

If $t < T$, the joint distribution is [R. B. Gopaluni, 2008]

(8) $p(x_{i,t}, x_{i,t+1} \mid Y) = p(x_{i,t+1} \mid Y)\, \dfrac{p(x_{i,t+1} \mid x_{i,t})\, p(x_{i,t} \mid y_{i,1:t}, Y_{1:i})}{\int p(x_{i,t+1} \mid x_{i,t})\, p(x_{i,t} \mid y_{i,1:t}, Y_{1:i})\, dx_{i,t}}$

Then the smoother weights are $\omega_{i,t,t+1}^n = \eta_{i,t}^n \big/ \sum_{n=1}^N \eta_{i,t}^n$, where $\eta_{i,t}^n = \omega_{i,t}^n\, \omega_{i,t+1|iT}^n\, \dfrac{p(x_{i,t+1}^n \mid x_{i,t}^n)}{\sum_{k=1}^N \omega_{i,t}^k\, p(x_{i,t+1}^n \mid x_{i,t}^k)}$.
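A compact numpy sketch of that weight update (the particle arrays and the transition density below are hypothetical stand-ins, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500                                   # number of particles (assumed)

# Hypothetical particles and weights at times t and t+1 for one batch i.
x_t = rng.normal(size=N)                  # x_{i,t}^n
x_t1 = rng.normal(size=N)                 # x_{i,t+1}^n
w_t = np.full(N, 1.0 / N)                 # filter weights omega_{i,t}^n
w_t1_s = np.full(N, 1.0 / N)              # smoother weights omega_{i,t+1|iT}^n

def trans_pdf(x_next, x_prev):
    # Assumed transition density p(x_{t+1} | x_t): a Gaussian random walk.
    return np.exp(-0.5 * (x_next - x_prev) ** 2) / np.sqrt(2 * np.pi)

# Denominator of eta: sum_k omega_{i,t}^k p(x_{i,t+1}^n | x_{i,t}^k), per n.
pairwise = trans_pdf(x_t1[:, None], x_t[None, :])   # shape (N, N): [n, k]
denom = pairwise @ w_t                              # length N, indexed by n

# eta_{i,t}^n and the normalized smoother weights omega_{i,t,t+1}^n.
eta = w_t * w_t1_s * np.diag(pairwise) / denom
w_smooth = eta / eta.sum()
print(w_smooth.sum())  # 1.0
```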

If t = T, the joint distribution involves the initial states as

(9) $$\begin{aligned} p(x_{i,1}, x_{i,T}, x_{i+1,1} \mid Y) &= p(x_{i+1,1} \mid Y)\, \frac{p(x_{i+1,1} \mid x_{i,1}, x_{i,T})\, p(x_{i,T} \mid Y_{1:i})\, p(x_{i,1} \mid Y_{1:i})}{\iint p(x_{i+1,1} \mid x_{i,1}, x_{i,T})\, p(x_{i,T} \mid Y_{1:i})\, p(x_{i,1} \mid Y_{1:i})\, dx_{i,1}\, dx_{i,T}} \\ &= \sum_{n=1}^{N} \omega_{i,T,1}^n\, \delta(x_{i+1,1} - x_{i+1,1}^n)\, \delta(x_{i,1} - x_{i,1}^n)\, \delta(x_{i,T} - x_{i,T}^n) \end{aligned}$$

The smoother weights are $\omega_{i,T,1}^n = \eta_{i,T}^n \big/ \sum_{n=1}^N \eta_{i,T}^n$, where $\eta_{i,T}^n = \omega_{i,T}^n\, \omega_{i+1|iT}^n\, \omega_{i+1,1|iT}^n\, \dfrac{p(x_{i+1,1}^n \mid x_{i,T}^n, x_{i,1}^n)}{\sum_{k=1}^N \omega_{i,1|iT}^k\, \omega_{i,T}^k\, p(x_{i+1,1}^n \mid x_{i,T}^k, x_{i,1}^k)}$.

To summarize, the Q function can be expressed as

(10) $$\begin{aligned} Q(\theta_i, \theta) ={}& \sum_{k=1}^{N} \omega_{1,1|IT}^k \log p_\theta(x_{1,1}^k) + \sum_{i=2}^{I} \sum_{j=2}^{T} \sum_{k=1}^{N} \omega_{i,j-1,j}^k \log p_\theta(x_{i,j}^k \mid x_{i,j-1}^k) \\ &+ \sum_{i=2}^{I} \sum_{k=1}^{N} \omega_{i-1,T,1}^k \log p_\theta(x_{i,1}^k \mid x_{i-1,T}^k, x_{i-1,1}^k) + \sum_{j=2}^{T} \sum_{k=1}^{N} \omega_{1,j-1,j}^k \log p_\theta(x_{1,j}^k \mid x_{1,j-1}^k) \\ &+ \sum_{i=1}^{I} \sum_{j=1}^{T} \sum_{k=1}^{N} \omega_{i,j|IT}^k \log p_\theta(y_{i,j} \mid x_{i,j}^k) \end{aligned}$$

If an analytical solution for the $\theta$ that maximizes (10) is intractable, numerical methods can be employed to obtain the parameter estimate. Computation of (10) and of $\theta$ is iterated until $\theta$ converges.

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780444595065500183

Time varying signals

Grant E Hearn, Andrew V Metcalfe, in Spectral Analysis in Engineering, 1995

3.5.1 Stationarity

A stochastic process is strictly stationary if the joint distribution of

$(X_{t_1}, \ldots, X_{t_n})$

is the same as the joint distribution of

$(X_{t_1 + k}, \ldots, X_{t_n + k})$

for any set of times $\{t_1, \ldots, t_n\}$ and any value of $k$. As we are limiting our attention to the first two moments, we will adopt the rather weaker notion of second-order stationarity. If we are prepared to assume a multivariate normal distribution, then the two definitions are equivalent. A stochastic process is second-order stationary if

(3.4) $E(X_t) = \mu, \quad \text{a constant for all } t$

and $\gamma(t_1, t_2)$ depends only on the difference between $t_2$ and $t_1$, known as the lag and usually denoted by $k$. That is,

(3.5) $\operatorname{cov}(X_t, X_{t+k}) \equiv \gamma(k), \quad \text{where } k = t_2 - t_1$

and the acf becomes

(3.6) $\rho(k) = \gamma(k)/\gamma(0)$

It follows from the definition of ρ(k) that ρ(0) is 1 and ρ(-k) is equal to ρ(k). Also, as ρ(k) is a special case of a correlation coefficient it is less than or equal to 1 in absolute value. A plot of ρ(k) against k is called a correlogram. In summary, a random process is second-order stationary if its mean and variance do not change with time and if the covariance depends only on lag and not on absolute time. If a random process includes a trend or seasonal effects, or a change in variance, it is non-stationary. A plot of the data is an important first step in looking for non-stationarity in time series. Figures 3.5(a) and (b) show realizations from non-stationary random processes. The first includes an obvious trend, the second an obvious increase in variance with time.
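A short sketch of a sample correlogram (assuming a synthetic series, since the chapter's data are not reproduced here): estimate $\gamma(k)$ from a realization, normalize by $\gamma(0)$, and plot $\rho(k)$ against $k$:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Assumed example series: a stationary AR(1) process x_t = 0.7 x_{t-1} + e_t.
n = 1_000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.normal()

def acf(series, max_lag):
    """Sample autocorrelation rho(k) = gamma(k) / gamma(0), Eq. (3.6)."""
    series = series - series.mean()
    gamma0 = np.mean(series * series)
    return np.array([np.mean(series[: n - k] * series[k:]) / gamma0
                     for k in range(max_lag + 1)])

lags = np.arange(31)
plt.bar(lags, acf(x, 30), width=0.3)   # the correlogram: rho(k) against k
plt.xlabel("lag k")
plt.ylabel("rho(k)")
plt.show()
```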

Fig. 3.5. (a) An obvious trend with time; (b) increase in variance with time

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780340631713500051

Application of Bayes Network analysis to RWGD siting

Thea Hincks, ... Steve Sparks, in Geological Repository Systems for Safe Disposal of Spent Nuclear Fuels and Radioactive Waste (Second Edition), 2017

19.2 Bayes Networks: introduction and methods

The basic concept behind the treatment of uncertainty in Bayes Networks is conditional probability (Jensen, 2001). A BN is a directed acyclic graph comprising a set of variables (nodes) representing states of the system, together with a set of directed links (arcs) representing conditional dependencies between the nodes (Jensen, 2001). Bayes Networks form the basis of many expert systems, from medical diagnosis and decision making to bioinformatics and, more recently, widespread application in hazard and risk assessment (Spiegelhalter et al., 1993; Ale et al., 2008; Aspinall and Woo, 2014).

For the simple case where nodes can be assigned discrete states, node relationships are described by conditional probability tables (CPTs, e.g., Table 19.1). Continuous variables are represented by joint probability distributions. Static networks describe the state of a system with no history (Fig. 19.1). Dynamic BNs (e.g., the special case Hidden Markov Model, Fig. 19.2) describe systems where the state of the system at any time is dependent on any number of past states.

Table 19.1. Example conditional probability table

P(Y|X)      Y = 0     Y = 1
X = 0       0.8       0.2
X = 1       0.96      0.04

Figure 19.1. A simple Bayes Network. For a Bayes Network B   =   {X,Y,U}, let X represent the set of unobserved or hidden states (nodes shaded gray), Y the set of observable states (the evidence) and U the set of directed links between nodes. Arrows indicate direction of causality or influence.

Figure 19.2. The Hidden Markov Model. A simple dynamic model used to infer the state of a hidden variable X (shaded gray) given an observation on Y (Murphy, 2002). This is a first-order dynamic model—the state of X at time t is influenced by its state at time t − 1. Higher order processes may be tied over several time slices. The prior distribution P(X0) and transition model P(Xt|Xt−1) can be learned from data or determined by expert elicitation or physical models.

For data-rich applications, network parameters and structure can be estimated entirely from data, using learning algorithms (e.g., Murphy, 2002; Hanea et al., 2010; Nickel et al., 2015). Where data are limited, or it is necessary to model states or scenarios outside the set of observations, expert judgment can be applied to develop a more comprehensive model (Morales et al., 2008; Aspinall and Cooke, 2013). When the model is fully parameterized, inference can be performed in any direction, e.g., to estimate the probability distribution of an unobservable node, or forecast future states.

Methods of analyzing information in the BN, such as entropy (a measure of unpredictability) and "mutual information" (MI) (see below, and results in Section 19.4), have considerable value in communicating model uncertainty, sensitivity to different assumptions, and the strength of relationships between nodes. High entropy can result from high natural variability (which cannot be reduced); however it could also be an indication of an inadequate model, resulting from a poor understanding of the processes involved, or an inaccurate representation of the key states and interactions. MI is a measure of conditional dependency between two variables and is an elegant way of expressing dependencies between elements in a system that may not be immediately obvious, especially in complex models with numerous links.

19.2.1 The marginal distribution

For discrete random variables X,Y with joint distribution function P(x,y), the marginal distribution of X, P(x), is the sum of the joint distribution of X and Y, P(x,y), over all values of Y:

(19.1) $P(x) = \sum_y P(x, y) = \sum_y P(x \mid y)\, P(y)$

19.2.2 Entropy

Entropy is a measure of unpredictability in a system (Bedford and Cooke, 2001). Zero entropy means the state is known exactly, with a probability of 1. Maximum entropy corresponds to all states having equal probability (maximum unpredictability). For a node (variable) X, with n discrete states, the entropy H(X) is defined as follows:

(19.2) $H(X) = \sum_{i=1}^{n} P(x_i) \log_2\!\left(\dfrac{1}{P(x_i)}\right)$

where $P(x_i)$ is the probability of $X$ being in state $x_i$.

The maximum entropy for any given node will therefore depend on the number of possible states:

(19.3) $H_{\max}(X) = \log_2 n$

To compare the entropy of nodes with different numbers of states, it can be useful to compute the "normalized entropy" H 0(X):

(19.4) $H_0(X) = \dfrac{H(X)}{H_{\max}(X)} = \dfrac{H(X)}{\log_2 n}$

19.2.3 Mutual information

The MI of nodes X and Y, MI(X,Y), is the reduction of uncertainty in Y by knowing X. MI can be computed for any pair of nodes, regardless of whether they are directly connected. MI is used in algorithms for estimating network structure from data, as high MI between a pair of nodes indicates a strong conditional dependency (in some direction). Low MI, typically defined by some cut-off point, implies conditional independence (Kane et al., 2003).

As the BN presented here has been developed using expert opinion (rather than observational data), computing the MI gives a measure of the perceived strength of connection between nodes. This can be used to assess the extent to which individual states and processes are expected to impact on the various risk factors.

For two variables X and Y with discrete states, MI(X,Y) is defined as:

(19.5) $\operatorname{MI}(X, Y) = H(Y) - H(Y \mid X) = H(X) - H(X \mid Y)$

where H(X) denotes the entropy of node X (Ebert-Uphoff, 2006). The function is symmetric in X and Y and does not indicate the direction of influence (does not distinguish X as a parent of Y or vice versa).

MI percentage is the percentage reduction of uncertainty in target node Y due to knowledge of X and is not symmetric in X and Y:

(19.6) $\operatorname{MI}\%(X, Y) = \dfrac{H(Y) - H(Y \mid X)}{H(Y)} \times 100$
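Using the CPT of Table 19.1 as input, together with an assumed uniform prior on X (which the table alone does not supply), the entropy and MI quantities can be computed directly:

```python
import numpy as np

# Conditional probabilities P(Y|X) from Table 19.1; rows are X = 0, 1.
p_y_given_x = np.array([[0.8, 0.2],
                        [0.96, 0.04]])
p_x = np.array([0.5, 0.5])   # assumed uniform prior on X (not given in text)

p_xy = p_x[:, None] * p_y_given_x      # joint P(x, y)
p_y = p_xy.sum(axis=0)                 # marginal of Y, Eq. (19.1)

def entropy(p):
    """H = sum p log2(1/p), Eq. (19.2); terms with p = 0 contribute 0."""
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

H_y = entropy(p_y)
# Conditional entropy H(Y|X) = sum_x P(x) H(Y | X = x).
H_y_given_x = float(np.sum(p_x * [entropy(row) for row in p_y_given_x]))

mi = H_y - H_y_given_x                          # Eq. (19.5)
print(f"H(Y) = {H_y:.3f} bits, MI(X, Y) = {mi:.3f} bits")
print(f"MI% = {100 * mi / H_y:.1f}%")           # Eq. (19.6)
```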

To conclude, BNs provide a clear and computationally efficient way to model complex causal relationships between observable and unobservable states and enable diverse sources of information to be jointly interpreted. By representing the system components and interactions graphically, it can be easier to identify conceptual problems or logical inconsistencies in the model and aid discussion and communication of key scientific ideas between both experts and nontechnical stakeholders. This is particularly important where model development requires input from experts from a range of disciplines. Subcomponents of the system can also be developed separately, for ease of discussion and integration of expert judgment, before being integrated into a full BN model.

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780081006429000190

Foundations of Distributed Source Coding

Krishnan Eswaran, Michael Gastpar, in Distributed Source Coding, 2009

Definition A.1

Let X, Y be discrete random variables with joint distribution p(x, y). Then the entropy H(X) of X is a real number given by the expression

(A.53) $H(X) = -E[\log p(X)] = -\sum_x p(x) \log p(x),$

the joint entropy H(X, Y) of X and Y is a real number given by the expression

(A.54) $H(X, Y) = -E[\log p(X, Y)] = -\sum_{x,y} p(x, y) \log p(x, y),$

and the conditional entropy H(X|Y) of X given Y is a real number given by the expression

(A.55) $H(X \mid Y) = -E[\log p(X \mid Y)] = H(X, Y) - H(Y).$
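A quick numerical check of Definition A.1 (with an assumed 2x2 joint pmf and base-2 logarithms) confirms that $H(X \mid Y) = H(X, Y) - H(Y)$ agrees with the definition $-E[\log p(X \mid Y)]$:

```python
import numpy as np

# Assumed joint pmf p(x, y) on a 2 x 2 alphabet (rows: x, columns: y).
p_xy = np.array([[0.4, 0.1],
                 [0.2, 0.3]])

def H(p):
    """Entropy in bits: -sum p log2 p over nonzero entries (Eq. (A.53))."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

p_y = p_xy.sum(axis=0)           # marginal of Y

H_xy = H(p_xy)                   # joint entropy, Eq. (A.54)
H_x_given_y = H_xy - H(p_y)      # conditional entropy via Eq. (A.55)

# The same quantity computed directly as -E[log p(X|Y)]:
p_x_given_y = p_xy / p_y[None, :]
direct = float(-np.sum(p_xy * np.log2(p_x_given_y)))
print(H_x_given_y, direct)       # equal, up to floating point
```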

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780123744852000056

Validation of internal rating systems and PD estimates*

Dirk Tasche, in The Analytics of Risk Model Validation, 2008

3.5 Equivalence of the two descriptions

So far, we have seen two descriptions of the joint distribution of the score variable and the state variable, which look quite different at first glance. However, thanks to Bayes' formula, the two descriptions are actually equivalent.

Suppose first that a description of the joint distribution of score and state by the total probability of default $p$ and the two conditional densities $f_D$ and $f_N$ according to Equations 3.1a and 3.1b is known. Then the unconditional score density $f$ can be expressed as

(3.3a) $f(s) = p\, f_D(s) + (1 - p)\, f_N(s),$

and the conditional PD given that the score variable takes on the value s can be written as

(3.3b) $P(D \mid S = s) = \dfrac{p\, f_D(s)}{f(s)}.$

Assume now that the unconditional score density $f$ in the sense of Equation 3.2b and the function representing the conditional PDs given the scores in the sense of Equation 3.2a are known. Then the total PD $p$ can be calculated by integrating the conditional PD against the unconditional density $f$,

(3.4a) $p = \int P(D \mid S = s)\, f(s)\, ds,$

and both the conditional densities of the score variable can be obtained via

(3.4b) $f_D(s) = P(D \mid S = s)\, f(s)/p \quad \text{and} \quad f_N(s) = P(N \mid S = s)\, f(s)/(1 - p).$
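A numerical sketch of this equivalence (with invented normal score densities and an assumed total PD, not data from the chapter) recovers $p$ from the second description via Eq. (3.4a):

```python
import numpy as np

# Assumed first description: total PD p and conditional score densities
# f_D (defaulters) and f_N (non-defaulters), here taken to be normals.
p = 0.05

def norm_pdf(s, mu, sig):
    return np.exp(-0.5 * ((s - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

f_D = lambda s: norm_pdf(s, mu=-1.0, sig=1.0)
f_N = lambda s: norm_pdf(s, mu=+1.0, sig=1.0)

s = np.linspace(-8, 10, 4001)
ds = s[1] - s[0]

f = p * f_D(s) + (1 - p) * f_N(s)        # unconditional density, Eq. (3.3a)
pd_given_s = p * f_D(s) / f              # conditional PD, Eq. (3.3b)

# Second description back to the first: Eq. (3.4a) recovers the total PD ...
p_back = float(np.sum(pd_given_s * f) * ds)
print(f"recovered p = {p_back:.6f}")     # ~0.05

# ... and Eq. (3.4b) recovers the conditional density of the defaulters.
f_D_back = pd_given_s * f / p_back
print("max |f_D - f_D_back| =", np.abs(f_D(s) - f_D_back).max())
```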

Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780750681582500147