Methods of factor analysis. Factor analysis of profit. Why factor analysis is needed

FACTOR ANALYSIS

The idea of factor analysis

When studying complex objects, phenomena, or systems, the factors that determine the properties of these objects very often cannot be measured directly, and sometimes even their number and meaning are unknown. Other quantities, however, may be available for measurement that depend in one way or another on the factors of interest. When the influence of an unknown factor manifests itself in several measured features or properties of an object, these features show a close relationship with each other, and the total number of factors can be much smaller than the number of measured variables.

Factor analysis methods are used to identify factors that determine the measured characteristics of objects.

An example of the application of factor analysis is the study of personality traits based on psychological tests. Personality traits cannot be directly measured. They can only be judged by a person’s behavior or the nature of their answers to questions. To explain the results of experiments, they are subjected to factor analysis, which allows us to identify those personal properties that influence the behavior of an individual.
The basis of various methods of factor analysis is the following hypothesis: the observed or measured parameters are only indirect characteristics of the object being studied; in reality, there are internal (hidden, latent, not directly observable) parameters and properties, the number of which is small and which determine the values ​​of the observed parameters. These internal parameters are usually called factors.

The purpose of factor analysis is to condense the initial information by expressing a large number of characteristics under consideration through a smaller number of more capacious internal characteristics of the phenomenon, which, however, cannot be measured directly.

It has been established that identifying and then monitoring the level of common factors makes it possible to detect pre-failure conditions of an object at very early stages of defect development. Factor analysis also allows you to monitor the stability of correlations between individual parameters. It is the correlations between parameters, as well as between parameters and common factors, that contain the main diagnostic information about the processes. The use of the tools of the Statistica package when performing factor analysis eliminates the need for additional computing tools and makes the analysis visual and understandable for the user.

The results of factor analysis will be successful if the identified factors can be interpreted on the basis of the meaning of the indicators that characterize them. This stage of the work is critical: it requires a clear understanding of the substantive meaning of the indicators that are used for the analysis and on the basis of which the factors are identified. Therefore, indicators should be carefully selected for factor analysis in advance, guided by their meaning rather than by the desire to include as many of them as possible in the analysis.

The essence of factor analysis

Let us present several basic propositions of factor analysis. Suppose that for the matrix X of measured object parameters there is a covariance (correlation) matrix C, where r is the number of parameters and n is the number of observations. By the linear transformation X = QY + U, the dimension of the original factor space X can be reduced to the level of Y, with r' << r. This corresponds to transforming a point characterizing the state of the object in r-dimensional space into a point in a new space of lower dimension r'. Obviously, the geometric proximity of two or more points in the new factor space indicates the stability of the object's state.

Matrix Y contains unobservable factors, which are essentially hyperparameters characterizing the most general properties of the analyzed object. Common factors are most often chosen to be statistically independent, which facilitates their physical interpretation. The vector of observed features X can then be interpreted as the consequence of changes in these hyperparameters.

Matrix U consists of residual factors, which mainly include measurement errors of the characteristics x(i). The rectangular matrix Q contains factor loadings that determine the linear relationship between features and hyperparameters.
Factor loadings are the values of the correlation coefficients of each of the original characteristics with each of the identified factors. The closer the relationship of a given characteristic with the factor under consideration, the higher the value of the factor loading. A positive sign of a factor loading indicates a direct relationship (and a negative sign an inverse relationship) between the given characteristic and the factor.

Thus, data on factor loadings make it possible to formulate conclusions about the set of initial features that reflect a particular factor, and about the relative weight of an individual feature in the structure of each factor.
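
To make the model concrete, here is a minimal numpy sketch; the loadings, noise level and number of features are illustrative assumptions, not data from the worked example later in this text. It generates two hidden factors, mixes them into observed features according to X = QY + U, and checks that the correlations between features and factors reproduce the loadings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000                                   # number of observations

# Loadings chosen so that each feature has unit variance:
# var(x_i) = sum_j q_ij^2 + u_i^2 = 1, hence the loadings are exactly
# the correlations between the features and the common factors.
Q = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.8],
              [0.0, 0.9]])
u = np.sqrt(1.0 - (Q ** 2).sum(axis=1))        # residual standard deviations

Y = rng.standard_normal((2, n))                # hidden common factors (hyperparameters)
U = u[:, None] * rng.standard_normal((5, n))   # residual factors (measurement noise)
X = Q @ Y + U                                  # observed features: X = QY + U

for i in range(5):
    r = [np.corrcoef(X[i], Y[j])[0, 1] for j in range(2)]
    print(f"feature {i + 1}: corr with factors = {np.round(r, 2)}  (loadings {Q[i]})")
```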

The factor analysis model is similar to the models of multivariate regression and analysis of variance. The fundamental difference of the factor analysis model is that the vector Y consists of unobservable factors, whereas in regression analysis it consists of recorded parameters. On the right-hand side of equation (8.1), the unknowns are the matrix of factor loadings Q and the matrix of values of the common factors Y.

To find the matrix of factor loadings, the equation QQ^t = C − V is used, where Q^t is the transposed matrix Q and V is the covariance matrix of the residual factors U, which is assumed to be diagonal. The equation is solved iteratively, starting from some zero-order approximation of the covariance matrix V(0). After the matrix of factor loadings Q has been found, the common factors (hyperparameters) are calculated from the equation
Y = (Q^t V^-1 Q)^-1 Q^t V^-1 X
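
A minimal numpy sketch of this score formula (the weighted least-squares estimate of the common factors); the loadings and residual variances used here are illustrative assumptions.

```python
import numpy as np

# Known (or previously estimated) loadings Q and residual covariance V: 5 features, 2 factors.
Q = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.8],
              [0.0, 0.9]])
V = np.diag(1.0 - (Q ** 2).sum(axis=1))     # diagonal covariance matrix of the residual factors

def factor_scores(X, Q, V):
    """Estimate common factors: Y = (Q^t V^-1 Q)^-1 Q^t V^-1 X, X has shape (features, observations)."""
    Vinv = np.linalg.inv(V)
    return np.linalg.inv(Q.T @ Vinv @ Q) @ Q.T @ Vinv @ X

X = np.random.default_rng(1).standard_normal((5, 100))   # stand-in for centred observations
Y_hat = factor_scores(X, Q, V)
print(Y_hat.shape)                                        # (2, 100): one score per factor per observation
```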

The Statistica statistical analysis package allows you to interactively calculate the matrix of factor loadings, as well as the values of several predefined main factors, most often two, based on the first two principal components of the original parameter matrix.

Factor analysis in the Statistica system

Let's consider the sequence of factor analysis using the example of processing the results of a questionnaire survey of enterprise employees. It is required to identify the main factors that determine the quality of working life.

At the first stage, it is necessary to select variables for factor analysis. Using correlation analysis, the researcher tries to identify the relationship between the characteristics being studied, which, in turn, gives him the opportunity to identify a complete and non-redundant set of characteristics by combining highly correlated characteristics.

If factor analysis is carried out on all variables, the results may not be entirely objective, since some variables are determined by other data and cannot be regulated by the employees of the organization in question.

In order to understand which indicators should be excluded, let's build a matrix of correlation coefficients from the available data in Statistica: Statistics / Basic Statistics / Correlation Matrices / Ok. In the start window of this procedure, Product-Moment and Partial Correlations (Fig. 4.3), the One variable list button is used to calculate a square matrix. Select all variables (Select All), Ok, Summary. We obtain the correlation matrix.

If the correlation coefficient lies between 0.7 and 1, this indicates a strong correlation between indicators. In that case, one of the two strongly correlated variables can be eliminated. Conversely, if the correlation coefficient of a variable is small, that variable can be eliminated because it will add nothing to the overall picture. In our case, there is no strong correlation between any of the variables, so we will conduct factor analysis on the full set of variables.
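
The same preliminary screening can be reproduced outside Statistica. Below is a sketch in Python/pandas, assuming the questionnaire answers are stored in a hypothetical survey.csv with one numeric column per indicator (the file name and column layout are assumptions).

```python
import pandas as pd

# Hypothetical questionnaire data: one row per employee, one numeric column per indicator.
df = pd.read_csv("survey.csv")

corr = df.corr()                          # Pearson correlation matrix, as in Statistica
print(corr.round(2))

# Flag pairs with |r| >= 0.7 - candidates for removing one variable of the pair.
strong = [(a, b, round(corr.loc[a, b], 2))
          for i, a in enumerate(corr.columns)
          for b in corr.columns[i + 1:]
          if abs(corr.loc[a, b]) >= 0.7]
print(strong or "no strongly correlated pairs - keep the full set of variables")
```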

To run factor analysis, you need to call the Statistics/Multivariate Exploratory Techniques/Factor Analysis module. The Factor Analysis module window will appear on the screen.



For the analysis we select all the variables of the spreadsheet: Variables: Select All, Ok. The Input file line indicates Raw Data. Two types of source data are possible in the module: Raw Data and Correlation Matrix.

The MD deletion section specifies how missing values ​​are handled:
* Casewise – a way to exclude missing values ​​(default);
* Pairwise – pairwise method of eliminating missing values;
* Mean substitution – substitution of the mean instead of missing values.
The Casewise method ignores all rows of the spreadsheet that contain at least one missing value; this applies to all variables. The Pairwise method excludes missing values not for all variables, but only for the pair of variables being processed.

Let's choose the Casewise method of handling missing values.

Statistica will process missing values ​​in the manner specified, calculate a correlation matrix, and offer several factor analysis methods to choose from.

After clicking the Ok button, the Define Method of Factor Extraction window appears.

The top part of the window is informational: it reports that missing values are handled using the Casewise method, that 17 observations were processed and 17 observations were accepted for further calculations, and that the correlation matrix was calculated for 7 variables. The lower part of the window contains 3 tabs: Quick, Advanced, Descriptives.

There are two buttons in the Descriptives tab:
1- view correlations, means and standard deviations;
2- build multiple regression.

By clicking on the first button, you can view averages and standard deviations, correlations, covariances, and build various graphs and histograms.

In the Advanced tab, on the left side, select the Extraction method of factor analysis: Principal components. On the right side, set the maximum number of factors to 2. Either the maximum number of factors (Max no of factors) or the minimum eigenvalue (here 1) is specified.

Click Ok, and Statistica will quickly perform the calculations. The Factor Analysis Results window appears on the screen. As stated earlier, the results of factor analysis are expressed by a set of factor loadings. Therefore, further we will work with the Loadings tab.

The upper part of the window is informational:
Number of variables (number of analyzed variables): 7;
Method (factor selection method): Principal components;
Log (10) determinant of correlation matrix: –1.6248;
Number of factors extracted: 2;
Eigenvalues: 3.39786 and 1.19130.
At the bottom of the window there are functional buttons that allow you to comprehensively view the analysis results, numerically and graphically.
Factor rotation: in this drop-down list you can select different rotations of the axes. By rotating the coordinate system, a set of solutions can be obtained, from which an interpretable one must be selected.

There are various methods of rotating the coordinate axes; the Statistica factor analysis module offers eight of them. For example, the varimax method corresponds to a coordinate transformation that maximizes variance. The varimax method produces a simplified description of the columns of the factor matrix, driving all values toward 1 or 0; the criterion considered is the variance of the squared factor loadings. The factor matrix obtained by varimax rotation is more invariant to the choice of different sets of variables.

Quartimax rotation aims at a similar simplification, but with respect to the rows of the factor matrix. Equimax is in between: when rotating factors by this method, an attempt is made to simplify both columns and rows. The rotation methods considered so far are orthogonal rotations, i.e. they result in uncorrelated factors. The direct oblimin and promax rotation methods are oblique rotations, which result in factors that are correlated with each other. The term "normalized" in the names of the methods indicates that the factor loadings are normalized, that is, divided by the square root of the corresponding variance.

Of all the proposed methods, we will first look at the result of the analysis without rotating the coordinate system - Unrotated. If the result obtained turns out to be interpretable and suits us, then we can stop there. If not, you can rotate the axes and look at other solutions.
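
For readers without Statistica, an analogous workflow can be sketched with scikit-learn. Note that FactorAnalysis in scikit-learn uses maximum-likelihood extraction rather than principal components, so the numbers will differ from the figures quoted above, and the file name and data layout here are assumptions; the point is only the comparison of unrotated and varimax-rotated loadings.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical observations x variables matrix of standardized questionnaire answers.
X = np.loadtxt("survey_standardized.csv", delimiter=",")

for rot in (None, "varimax"):
    fa = FactorAnalysis(n_components=2, rotation=rot).fit(X)
    loadings = fa.components_.T            # rows: variables, columns: factors
    print(f"rotation = {rot}")
    print(np.round(loadings, 2))
```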

Click on the "Factor Loading" button and look at the factor loadings numerically.



Let us recall that factor loadings are the values ​​of the correlation coefficients of each variable with each of the identified factors.

A factor loading value greater than 0.7 indicates that the given characteristic or variable is closely related to the factor in question. The closer the relationship of a given characteristic with the factor under consideration, the higher the value of the factor loading. A positive sign of a factor loading indicates a direct relationship (and a negative sign an inverse relationship) between the given characteristic and the factor.
So, from the table of factor loadings, two factors were identified. The first defines OSB - a sense of social well-being. The remaining variables are determined by the second factor.

The Expl. Var row (Fig. 8.5) shows the variance attributable to each factor, and the Prp. Totl row shows the proportion of the total variance accounted for by the first and second factors. The first factor accounts for 48.5% of the total variance and the second for 17.0%; the rest is attributable to other, unaccounted-for factors. Together, the two identified factors explain 65.5% of the total variance.
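
These proportions follow directly from the eigenvalues: for standardized variables the total variance equals the number of variables, so each factor's share is its eigenvalue divided by 7. A quick check of the figures quoted above:

```python
eigenvalues = [3.39786, 1.19130]   # eigenvalues of the two retained components
n_vars = 7                         # total variance of standardized variables equals their number

shares = [ev / n_vars for ev in eigenvalues]
print([round(s, 3) for s in shares])       # [0.485, 0.17]  -> the Prp. Totl row
print(round(sum(shares), 3))               # 0.656 -> about 65.5% of the total variance explained
```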



Here we also see two groups of factors: OSB and the remaining set of variables, from which JSR, the desire to change jobs, stands out. Apparently, it makes sense to explore this desire more thoroughly by collecting additional data.

Selection and clarification of the number of factors

Once you know how much variance each factor contributed, you can return to the question of how many factors should be retained. By its nature, this decision is arbitrary. But there are some commonly used recommendations, and in practice, following them gives the best results.

The number of common factors (hyperparameters) is determined by calculating the eigenvalues ​​(Fig. 8.7) of the X matrix in the factor analysis module. To do this, in the Explained variance tab (Fig. 8.4), you need to click the Scree plot button.


The maximum number of common factors can be equal to the number of eigenvalues ​​of the parameter matrix. But as the number of factors increases, the difficulties of their physical interpretation increase significantly.

The first criterion (the Kaiser criterion) is to retain only factors with eigenvalues greater than 1. Essentially, this means that if a factor does not contribute variance equivalent to at least the variance of one variable, it is omitted. This criterion is the most widely used. In the example above, based on this criterion, only 2 factors (the first two principal components) should be retained.

The second criterion (the scree criterion) is to find the place on the graph where the decrease in eigenvalues from left to right slows down as much as possible. It is assumed that to the right of this point there is only "factorial scree". According to this criterion, 2 or 3 factors can be retained in the example.
From the figure it can be seen that the third factor only slightly increases the share of the total variance.
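
Both criteria are easy to reproduce from the eigenvalues of the correlation matrix; a sketch assuming the matrix has been saved to a hypothetical corr_matrix.csv:

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed 7x7 correlation matrix of the analyzed variables.
corr = np.loadtxt("corr_matrix.csv", delimiter=",")
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]        # eigenvalues, largest first

print("Kaiser criterion keeps", int((eigvals > 1).sum()), "factors")

plt.plot(range(1, len(eigvals) + 1), eigvals, "o-")
plt.axhline(1.0, linestyle="--")                          # Kaiser threshold
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```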

Factor analysis of parameters makes it possible to identify, at an early stage, a disruption of the working process (the appearance of a defect) in various objects, which often cannot be noticed by direct observation of the parameters. This is explained by the fact that a violation of the correlations between parameters occurs much earlier than a change in any single parameter. Factor analysis of the parameters allows such a distortion of the correlations to be detected in time. To do this, it is enough to have arrays of the registered parameters.

General recommendations can be given for the use of factor analysis, regardless of the subject area.
* Each factor must have at least two measured parameters.
* The number of parameter measurements must be greater than the number of variables.
* The number of factors must be justified based on the physical interpretation of the process.
* You should always ensure that the number of factors is much less than the number of variables.

The Kaiser criterion sometimes retains too many factors, while the scree criterion sometimes retains too few factors. However, both criteria are quite good under normal conditions, when there are a relatively small number of factors and many variables. In practice, the more important question is when the resulting solution can be interpreted. Therefore, it is common to examine several solutions with more or fewer factors, and then select the one that makes the most sense.

The space of initial features should be presented in homogeneous measurement scales, since this allows the use of correlation matrices in calculations. Otherwise, the problem of “weights” of various parameters arises, which leads to the need to use covariance matrices when calculating. This may result in an additional problem of repeatability of the results of factor analysis when the number of characteristics changes. It should be noted that this problem is simply solved in the Statistica package by moving to a standardized form of representing parameters. In this case, all parameters become equivalent in terms of the degree of their connection with the processes in the object of study.

Ill-conditioned matrices

If there are redundant variables in the source data set and they have not been eliminated by correlation analysis, then the inverse matrix (8.3) cannot be calculated. For example, if a variable is the sum of two other variables selected for the analysis, the correlation matrix for that set of variables cannot be inverted, and factor analysis fundamentally cannot be performed. In practice, this occurs when one tries to apply factor analysis to many highly dependent variables, as sometimes happens, for example, in the processing of questionnaires. It is then possible to artificially lower all the correlations in the matrix by adding a small constant to its diagonal elements and then re-standardizing it. This procedure usually results in a matrix that can be inverted and is therefore suitable for factor analysis. The procedure barely affects the set of factors, but the estimates become less accurate.
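
A sketch of this regularization trick, assuming a small shrinkage constant of 0.01; the example deliberately makes one variable the exact sum of two others to produce a singular correlation matrix.

```python
import numpy as np

def regularize_correlation(R, eps=0.01):
    """Add a small constant to the diagonal and re-standardize to a correlation matrix.

    This slightly shrinks all off-diagonal correlations, which usually makes an
    otherwise singular matrix invertible, at the cost of somewhat less accurate estimates.
    """
    R_shrunk = R + eps * np.eye(R.shape[0])
    d = np.sqrt(np.diag(R_shrunk))
    return R_shrunk / np.outer(d, d)

# Example: variable 3 is the exact sum of variables 1 and 2 -> singular correlation matrix.
rng = np.random.default_rng(2)
a, b = rng.standard_normal((2, 200))
X = np.column_stack([a, b, a + b])
R = np.corrcoef(X, rowvar=False)

print(np.linalg.cond(R))                          # enormous condition number (near-singular)
print(np.linalg.cond(regularize_correlation(R)))  # much smaller after shrinking
```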

Factor and regression modeling of systems with variable states

A variable state system (VSS) is a system whose response depends not only on the input action, but also on a generalized parameter, constant in time, that determines its state. A variable amplifier or attenuator is an example of the simplest VSS, in which the transmission coefficient can change discretely or smoothly according to some law. The study of VSS is usually carried out on linearized models in which the transient process associated with a change in the state parameter is considered complete.

Attenuators based on L-, T- and U-shaped connections of series- and parallel-connected diodes are the most widespread. The resistance of the diodes under the influence of the control current can vary over a wide range, which makes it possible to change the frequency response and the attenuation in the path. Independence of the phase shift when the attenuation is controlled is achieved in such attenuators by means of reactive circuits included in the basic structure. Obviously, the same level of introduced attenuation can be obtained with different ratios of the resistances of the parallel and series diodes, but the change in phase shift will be different.

We explore the possibility of simplifying the automated design of attenuators by eliminating the double optimization of the correcting circuits and the parameters of the controlled elements. As the VSS under study we take an electrically controlled attenuator whose equivalent circuit is shown in Fig. 8.8. The minimum level of attenuation is ensured when the resistance of element Rs is low and the resistance of element Rp is high. As the resistance of Rs increases and the resistance of Rp decreases, the introduced attenuation increases.

The dependences of the change in phase shift on frequency and attenuation for the circuit without correction and with correction are shown in Fig. 8.9 and 8.10 respectively. In the corrected attenuator, over the attenuation range of 1.3-7.7 dB and the frequency band of 0.01-4.0 GHz, the change in phase shift does not exceed 0.2°. In the attenuator without correction, the change in phase shift in the same frequency band and attenuation range reaches 3°. Thus, thanks to the correction, the change in phase shift is reduced by almost 15 times.


We will consider the correction and control parameters as independent variables, or factors, influencing the attenuation and the change in phase shift. This makes it possible, using the Statistica system, to conduct factor and regression analysis of the VSS in order to establish physical relationships between the circuit parameters and individual characteristics, as well as to simplify the search for optimal circuit parameters.

The initial data were generated as follows: for correction parameters and control resistances differing from the optimal ones both upward and downward, the introduced attenuation and the change in phase shift were calculated on a frequency grid of 0.01–4 GHz.

Statistical modeling methods, in particular factor and regression analysis, which have not previously been used to design discrete devices with variable states, make it possible to identify the physical patterns of operation of system elements. This facilitates the creation of a device structure based on a given optimality criterion. In particular, this section discussed the phase-invariant attenuator as a typical example of a state-variable system. Identification and interpretation of factor loadings that influence various characteristics under study makes it possible to change the traditional methodology and significantly simplify the search for correction parameters and regulation parameters.

It has been established that the use of a statistical approach to the design of such devices is justified both for assessing the physics of their operation and for justifying the circuit diagrams. Statistical modeling can significantly reduce the amount of experimental research.

Results

  • Monitoring common factors and the corresponding factor loadings is necessary for identifying the internal patterns of processes.
  • In order to determine the critical values ​​of controlled distances between factor loadings, the results of factor analysis for similar processes should be accumulated and generalized.
  • The use of factor analysis is not limited to the physical features of processes. Factor analysis is both a powerful method for monitoring processes and is applicable to the design of systems for a wide variety of purposes.

The main types of models used in financial analysis and forecasting.

Before we start talking about one of the types of financial analysis - factor analysis, let us recall what financial analysis is and what its goals are.

Financial analysis is a method for assessing the financial condition and performance of an economic entity based on studying the dependence and dynamics of financial reporting indicators.

Financial analysis has several purposes:

  • assessment of financial situation;
  • identifying changes in financial condition in space and time;
  • identification of the main factors that caused changes in financial condition;
  • forecast of main trends in financial condition.

As you know, there are the following main types of financial analysis:

  • horizontal analysis;
  • vertical analysis;
  • trend analysis;
  • method of financial ratios;
  • comparative analysis;
  • factor analysis.

Each type of financial analysis is based on the use of a model that makes it possible to evaluate and analyze the dynamics of the main indicators of the enterprise. There are three main types of models: descriptive, predicative and normative.

Descriptive models, also known as models of a descriptive nature, are fundamental for assessing the financial condition of an enterprise. These include: construction of a system of reporting balance sheets, presentation of financial statements in various analytical sections, vertical and horizontal analysis of reporting, a system of analytical ratios, and analytical notes to the statements. All these models are based on the use of accounting information.

Vertical analysis is based on a different presentation of the financial statements: in the form of relative values that characterize the structure of the generalizing total indicators. An obligatory element of the analysis is the dynamic series of these values, which makes it possible to track and predict structural changes in the composition of economic assets and the sources of their coverage.

Horizontal analysis allows you to identify trends in individual items or groups of items in the financial statements. It is based on calculating the basic growth rates of balance sheet and income statement items.
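
Both procedures reduce to simple arithmetic on the statements. A sketch with pandas, using invented balance-sheet figures purely for illustration:

```python
import pandas as pd

# Hypothetical balance-sheet items over three years (million rubles); the values are illustrative.
bal = pd.DataFrame(
    {"2021": [500, 300, 200], "2022": [550, 360, 190], "2023": [640, 410, 230]},
    index=["Fixed assets", "Inventories", "Receivables"],
)

vertical = bal / bal.sum() * 100                   # vertical analysis: structure, % of the total
horizontal = bal.div(bal["2021"], axis=0) * 100    # horizontal analysis: base growth rates, 2021 = 100%

print(vertical.round(1))
print(horizontal.round(1))
```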

The system of analytical ratios is the main element of financial analysis, used by various groups of users: managers, analysts, shareholders, investors, creditors, etc. There are dozens of such indicators, divided into several groups according to the main areas of financial analysis:

  • liquidity indicators;
  • financial stability indicators;
  • business activity indicators;
  • profitability indicators.

Predicative models are predictive models. They are used to forecast a company's income and its future financial condition. The most common of them are: calculation of the critical sales volume point, construction of forecast financial statements, dynamic analysis models (strictly determined factor models and regression models), and situation analysis models.

Normative models. Models of this type allow you to compare the actual results of enterprises with the expected ones calculated according to the budget. These models are used primarily in internal financial analysis. Their essence comes down to the establishment of standards for each cost item for technological processes, types of products, responsibility centers, etc. and to the analysis of deviations of actual data from these standards. The analysis is largely based on the use of strictly deterministic factor models.

As we see, modeling and analysis of factor models occupy an important place in the methodology of financial analysis. Let's consider this aspect in more detail.

Basics of modeling.

The functioning of any socio-economic system (which includes an operating enterprise) takes place under conditions of complex interaction of a set of internal and external factors. A factor is the cause, the driving force of a process or phenomenon, determining its character or one of its main features.

Classification and systematization of factors in the analysis of economic activity.

The classification of factors is their distribution into groups depending on common characteristics. It allows you to gain a deeper understanding of the reasons for changes in the phenomena under study, and to more accurately assess the place and role of each factor in the formation of the value of effective indicators.

The factors studied in the analysis can be classified according to different criteria.

By their nature, factors are divided into natural, socio-economic and production-economic.

Natural factors have a great influence on the results of activities in agriculture, forestry and other industries. Taking into account their influence makes it possible to more accurately assess the results of the work of business entities.

Socio-economic factors include the living conditions of workers, the organization of health-improving work at enterprises with hazardous production, the general level of personnel training, etc. They contribute to a more complete use of the enterprise’s production resources and increase the efficiency of its work.

Production and economic factors determine the completeness and efficiency of use of the enterprise's production resources and the final results of its activities.

Based on the degree of impact on the results of economic activity, factors are divided into major and minor. The main ones include factors that have a decisive impact on the performance indicator. Those that do not have a decisive impact on the results of economic activity in the current conditions are considered secondary. It should be noted that, depending on the circumstances, the same factor can be both primary and secondary. The ability to identify the main ones from the entire set of factors ensures the correctness of the conclusions based on the results of the analysis.

Depending on whether the activities of a given enterprise can affect them or not, factors are divided into internal and external. The analysis focuses on the internal factors that the enterprise can influence.

Factors are divided into objective, independent of the will and desires of people, and subjective, subject to the influence of the activities of legal entities and individuals.

According to the degree of prevalence, factors are divided into general and specific. Common factors operate in all sectors of the economy. Specific factors operate within a particular industry or a specific enterprise.

In the course of an organization's work, some factors influence the indicator under study continuously throughout the entire period. Such factors are called constant. Factors whose influence appears periodically are called variable (for example, the introduction of new technology or new types of products).

Of great importance for assessing the activities of enterprises is the division of factors, according to the nature of their action, into intensive and extensive. Extensive factors are associated with changes in the quantitative rather than qualitative characteristics of the enterprise's functioning; an example is an increase in the volume of production due to an increase in the number of workers. Intensive factors characterize the qualitative side of the production process; an example is an increase in the volume of production due to a higher level of labor productivity.

Most of the factors studied are complex in composition and consist of several elements. However, there are also factors that cannot be broken down into component parts. Accordingly, factors are divided into complex (composite) and simple (elemental). An example of a complex factor is labor productivity; a simple one is the number of working days in the reporting period.

Based on the level of subordination (hierarchy), factors of the first, second, third and subsequent levels of subordination are distinguished. First-level factors are those that directly affect the performance indicator. Factors that influence the performance indicator indirectly, through first-level factors, are called second-level factors, and so on.

It is clear that when studying the influence of any group of factors on the work of an enterprise, it is necessary to organize them, that is, to carry out an analysis taking into account their internal and external connections, interaction and subordination. This is achieved through systematization. Systematization is the placement of the phenomena or objects being studied in a certain order, identifying their relationship and subordination.

The creation of factor systems is one of the ways of such systematization of factors. Let us consider the concept of a factor system.

Factor systems

All phenomena and processes of the economic activity of enterprises are interdependent. A relationship between economic phenomena is a joint change in two or more phenomena. Among the many forms of regular relationships, an important role is played by cause-and-effect (deterministic) relationships, in which one phenomenon gives rise to another.

In the economic activity of an enterprise, some phenomena are directly related to each other, others - indirectly. For example, the amount of gross output is directly influenced by factors such as the number of workers and the level of their labor productivity. Many other factors indirectly affect this indicator.

In addition, each phenomenon can be considered as a cause and as a consequence. For example, labor productivity can be considered, on the one hand, as the reason for changes in production volume and the level of its cost, and on the other hand, as a result of changes in the degree of mechanization and automation of production, improvement in labor organization, etc.

Quantitative characterization of interrelated phenomena is carried out using indicators. Indicators characterizing the cause are called factor (independent) indicators; indicators characterizing the consequence are called effective (dependent) indicators. The set of factor and effective characteristics related by a cause-and-effect relationship is called a factor system.

Modeling any phenomenon is the construction of a mathematical expression of an existing relationship. Modeling is one of the most important methods of scientific knowledge. There are two types of dependencies studied in the process of factor analysis: functional and stochastic.

A relationship is called functional, or strictly determined, if each value of a factor characteristic corresponds to a well-defined non-random value of the resultant characteristic.

A relationship is called stochastic (probabilistic) if each value of a factor characteristic corresponds to a set of values ​​of the resulting characteristic, i.e., a certain statistical distribution.

A factor system model is a mathematical formula expressing the real relationships between the analyzed phenomena. In general terms it can be presented as follows:

y = f(x1, x2, ..., xn),

where y is the resultant characteristic and x1, x2, ..., xn are the factor characteristics.

Thus, each performance indicator depends on numerous and varied factors. The basis of economic analysis, and of this section in particular, is factor analysis: identifying, evaluating and predicting the influence of factors on changes in the performance indicator. The more detailed the study of how the performance indicator depends on particular factors, the more accurate the results of the analysis and the assessment of the quality of the enterprises' work. Without a deep and comprehensive study of factors, it is impossible to draw well-founded conclusions about the results of activities, identify production reserves, or justify plans and management decisions.

Factor analysis, its types and tasks.

Factor analysis is understood as a methodology for the comprehensive and systematic study and measurement of the impact of factors on the value of performance indicators.

In general terms, the following main stages of factor analysis can be distinguished:

  1. Setting the purpose of the analysis.
  2. Selection of factors that determine the performance indicators under study.
  3. Classification and systematization of factors in order to provide an integrated and systematic approach to the study of their influence on the results of economic activity.
  4. Determination of the form of dependence between factors and the performance indicator.
  5. Modeling the relationships between performance and factor indicators.
  6. Calculation of the influence of factors and assessment of the role of each of them in changing the value of the performance indicator.
  7. Working with the factor model (its practical use for managing economic processes).

The selection of factors for the analysis of a particular indicator is carried out on the basis of theoretical and practical knowledge of the particular industry. In doing so, one usually proceeds from the principle: the larger the set of factors studied, the more accurate the results of the analysis. At the same time, it must be borne in mind that if this set of factors is considered as a mechanical sum, without taking into account their interaction and without identifying the main, determining ones, the conclusions may be erroneous. In the analysis of business activity (ABA), an interconnected study of the influence of factors on the value of performance indicators is achieved through their systematization, which is one of the main methodological issues of this science.

An important methodological issue in factor analysis is determining the form of the dependence between factors and performance indicators: functional or stochastic, direct or inverse, linear or curvilinear. Here theoretical and practical experience is used, as well as methods of comparing parallel and dynamic series, analytical groupings of source information, graphical methods, etc.

Modeling of economic indicators also represents a complex problem in factor analysis, the solution of which requires special knowledge and skills.

Calculation of the influence of factors is the main methodological aspect of ABA. To determine the influence of factors on the final indicators, many methods are used, which will be discussed in more detail below.

The last stage of factor analysis is practical use of the factor model to calculate reserves for the growth of the effective indicator, to plan and predict its value when the situation changes.

Depending on the type of factor model, there are two main types of factor analysis - deterministic and stochastic.

Deterministic factor analysis is a technique for studying the influence of factors whose relationship with the effective indicator is functional in nature, that is, when the effective indicator of the factor model is presented as a product, a quotient or an algebraic sum of factors.

This type of factor analysis is the most common, since, being quite simple to use (compared to stochastic analysis), it allows you to understand the logic of the action of the main factors of enterprise development, quantify their influence, understand which factors and in what proportion it is possible and advisable to change to increase production efficiency. We will consider deterministic factor analysis in detail in a separate chapter.

Stochastic analysis is a methodology for studying factors whose connection with a performance indicator, unlike a functional one, is incomplete and probabilistic (correlation). If with a functional (complete) dependence with a change in the argument there is always a corresponding change in the function, then with a correlation connection a change in the argument can give several values ​​of the increase in the function depending on the combination of other factors that determine this indicator. For example, labor productivity at the same level of capital-labor ratio may be different at different enterprises. This depends on the optimal combination of other factors affecting this indicator.

Stochastic modeling is, to a certain extent, a complement and deepening of deterministic factor analysis. In factor analysis, these models are used for three main reasons:

  • it is necessary to study the influence of factors for which it is impossible to build a strictly determined factor model (for example, the level of financial leverage);
  • it is necessary to study the influence of complex factors that cannot be combined in the same strictly determined model;
  • it is necessary to study the influence of complex factors that cannot be expressed by one quantitative indicator (for example, the level of scientific and technological progress).

In contrast to the strictly deterministic approach, the stochastic approach requires a number of prerequisites for implementation:

  1. the presence of a population;
  2. sufficient volume of observations;
  3. randomness and independence of observations;
  4. homogeneity;
  5. the presence of a distribution of characteristics close to normal;
  6. the presence of a special mathematical apparatus.

The construction of a stochastic model is carried out in several stages:

  • qualitative analysis (setting the purpose of the analysis, defining the population, determining the effective and factor characteristics, choosing the period for which the analysis is carried out, choosing the analysis method);
  • preliminary analysis of the simulated population (checking the homogeneity of the population, excluding anomalous observations, clarifying the required sample size, establishing distribution laws for the indicators being studied);
  • construction of a stochastic (regression) model (clarification of the list of factors, calculation of estimates of regression equation parameters, enumeration of competing model options);
  • assessment of the adequacy of the model (checking the statistical significance of the equation as a whole and its individual parameters, checking the compliance of the formal properties of the estimates with the objectives of the study);
  • economic interpretation and practical use of the model (determining the spatio-temporal stability of the constructed relationship, assessing the practical properties of the model).
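
As a minimal illustration of the construction and adequacy-assessment stages, the sketch below fits a one-factor regression of labor productivity on the capital-labor ratio. The data are simulated, so the same factor value corresponds to different productivity values, which is exactly the stochastic situation described above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: labor productivity vs capital-labor ratio for 30 enterprises.
rng = np.random.default_rng(3)
capital_labor = rng.uniform(1.0, 5.0, 30)                       # factor characteristic
productivity = 20 + 8 * capital_labor + rng.normal(0, 5, 30)    # resultant characteristic + noise

X = capital_labor.reshape(-1, 1)
model = LinearRegression().fit(X, productivity)

print(round(model.intercept_, 1), round(model.coef_[0], 1))     # estimated regression parameters
print(round(model.score(X, productivity), 2))                   # R^2: share of explained variance
```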

In addition to dividing into deterministic and stochastic, the following types of factor analysis are distinguished:

    • direct and reverse;
    • single-stage and multi-stage;
    • static and dynamic;
    • retrospective and prospective (forecast).

In direct factor analysis, the research is conducted deductively, from the general to the particular. Reverse factor analysis studies cause-and-effect relationships by the method of logical induction, from particular, individual factors to general ones.

Factor analysis can be single-stage and multi-stage. The first type is used to study factors of only one level of subordination, without detailing them into their component parts; for example, y = a · b. In multi-stage factor analysis, the factors a and b are detailed into their constituent elements in order to study their behavior. The detailing of factors can be continued further. In this case, the influence of factors at different levels of subordination is studied.

It is also necessary to distinguish static and dynamic factor analysis. The first type is used to study the influence of factors on performance indicators as of a particular date. The second is a technique for studying cause-and-effect relationships in dynamics.

Finally, factor analysis can be retrospective, which studies the reasons for changes in performance indicators over past periods, and prospective (forecast), which examines the behavior of factors and performance indicators in the future.

Deterministic factor analysis.

Deterministic factor analysis has a fairly strict sequence of procedures:

  • construction of an economically sound deterministic factor model;
  • choosing a factor analysis technique and preparing conditions for its implementation;
  • implementation of counting procedures for model analysis;
  • formulating conclusions and recommendations based on the results of the analysis.

The first stage is especially important, since an incorrectly constructed model can lead to logically unjustified results. The meaning of this stage is as follows: any expansion of a strictly determined factor model must not contradict the logic of the cause-and-effect relationship. As an example, consider a model linking sales volume (P), headcount (H) and labor productivity (LP). Theoretically, three models can be examined:

P = H × LP;   H = P / LP;   LP = P / H.

All three formulas are correct from the point of view of arithmetic, but from the point of view of factor analysis only the first makes sense, since in it the indicators on the right-hand side are the factors, i.e. the cause that generates and determines the value of the indicator on the left-hand side (the consequence).

At the second stage, one of the methods of factor analysis is selected: integral, chain substitutions, logarithmic, etc. Each of these methods has its own advantages and disadvantages. We will consider a brief comparative description of these methods below.

Types of deterministic factor models.

The following deterministic analysis models exist:

additive model, i.e., a model in which the factors enter as an algebraic sum; an example is the commodity balance model:

R = Ib + P − Ie − D,

where R is sales; Ib is inventory at the beginning of the period; P is receipt of goods; Ie is inventory at the end of the period; D is other disposal of goods;

multiplicative model, i.e., a model in which the factors enter as a product; an example is the simplest two-factor model:

R = H × LP,

where R is sales; H is headcount; LP is labor productivity;

multiple model, i.e., a model representing a ratio of factors, for example:

F = OS / H,

where F is the capital-labor ratio; OS is the cost of fixed assets; H is headcount;

mixed model, i.e., a model in which the factors enter in various combinations, for example:

Rent = R / (OS + Ob),

where Rent is profitability; R is sales; OS is the cost of fixed assets; Ob is the cost of working capital.

A strictly deterministic model that has more than two factors is called multifactorial.

Typical problems of deterministic factor analysis.

In deterministic factor analysis, four typical problems can be distinguished:

  1. Assessing the influence of relative changes in factors on the relative changes in the performance indicator.
  2. Assessing the impact of an absolute change in the i-th factor on the absolute change in a performance indicator.
  3. Determining the ratio of the change in the effective indicator caused by a change in the i-th factor to the base value of the effective indicator.
  4. Determination of the share of the absolute change in the performance indicator caused by the change in the i-th factor in the total change in the performance indicator.

Let us characterize these problems and consider the solution to each of them using a specific simple example.

Example.

The volume of gross output (GP) depends on two main first-level factors: the number of employees (NH) and average annual output per employee (AG). We have a two-factor multiplicative model: GP = NH × AG. Let's consider a situation where both output and the number of workers in the reporting period deviated from the planned values.

Data for calculations are given in Table 1.

Table 1. Data for factor analysis of gross output volume.

Task 1.

The problem makes sense for multiplicative and multiple models. Let's consider the simplest two-factor multiplicative model GP = NH × AG. Obviously, when analyzing the dynamics of these indicators, the following relationship between the indices will hold:

I_GP = I_NH × I_AG,

where the value of each index is the ratio of the indicator's value in the reporting period to its value in the base period.

Let's calculate the indices of the number of employees and average annual output for our example:

I_NH = NH_fact / NH_plan = 1.2;

I_AG = AG_fact / AG_plan = 1.25.

According to the above rule, the gross output index is equal to the product of the indices of the number of workers and the average annual output, i.e. I_GP = 1.2 × 1.25 = 1.5.

Obviously, if we calculate the gross output index directly, we get the same value:

I_GP = GP_fact / GP_plan = 240,000 / 160,000 = 1.5.

We can conclude: as a result of an increase in the number of employees by 1.2 times and an increase in average annual output by 1.25 times, the volume of gross output increased by 1.5 times.

Thus, relative changes in factor and performance indicators are related by the same relationship as the indicators in the original model. This problem is solved by answering questions like: “What will happen if the i-th indicator changes by n%, and the j-th indicator changes by k%?”

Task 2.

This is the main task of deterministic factor analysis; its general formulation is as follows.

Let y = f(x1, x2, ..., xn) be a strictly determined model characterizing the dependence of the performance indicator y on n factors, and suppose all the indicators have received increments (for example, in dynamics, compared with the plan, or compared with the standard):

Δy = y1 − y0,   Δxi = xi1 − xi0.

It is required to determine what part of the increment of the performance indicator y is due to the increment of the i-th factor, i.e. to write the following decomposition:

Δy = Δy(x1) + Δy(x2) + ... + Δy(xn),

where Δy is the overall change in the performance indicator, which develops under the simultaneous influence of all the factor characteristics, and Δy(xi) is the change in the performance indicator under the influence of the i-th factor alone.

Depending on which method of model analysis is chosen, factor decompositions may differ. Therefore, in the context of this task, let us consider the main methods of analyzing factor models.

Basic methods of deterministic factor analysis.

One of the most important methodological tasks in ABA is determining the magnitude of the influence of individual factors on the increment of performance indicators. In deterministic factor analysis (DFA), the following methods are used for this: identifying the isolated influence of factors, chain substitution, absolute differences, relative differences, proportional division, the integral method, the logarithm method, etc.

The first three methods are based on elimination. To eliminate means to remove, reject, exclude the influence of all factors on the value of the effective indicator except one. The method proceeds from the assumption that the factors change independently of each other: first one changes while all the others remain unchanged, then two, then three, and so on, with the rest held constant. This makes it possible to determine the influence of each factor on the value of the indicator under study separately.

Let's give a brief description of the most common methods.

The chain substitution method is a very simple and visual method, the most universal of all. It is used to calculate the influence of factors in all types of deterministic factor models: additive, multiplicative, multiple and mixed. The method determines the influence of individual factors on the change in the value of the performance indicator by gradually replacing the base value of each factor indicator in the expression for the performance indicator with its actual value in the reporting period. For this purpose, a number of conditional values of the performance indicator are determined, which take into account the change in one, then two, then three, etc. factors, assuming that the others do not change. Comparing the value of the effective indicator before and after the change in the level of a particular factor makes it possible to determine the impact of that factor on the increment of the effective indicator, excluding the influence of other factors. With this method, a complete decomposition is achieved.

Let us recall that when using this method, the order in which the values ​​of the factors change is of great importance, since the quantitative assessment of the influence of each factor depends on this.

First of all, it should be noted that there is not, and cannot be, a single way to determine this order: there are models in which it can be chosen arbitrarily. Only for a small number of models can formalized approaches be used. In practice, this problem is not of great importance, since in retrospective analysis it is the trends and the relative importance of a particular factor that matter, rather than precise estimates of their influence.

Nevertheless, to maintain a more or less uniform approach to determining the order of replacement of factors in the model, general principles can be formulated. Let us introduce some definitions.

A characteristic that is directly related to the phenomenon under study and characterizes its quantitative side is called primary or quantitative. Such characteristics are: a) absolute (volumetric); b) summable in space and time. Examples are sales volume, headcount, the cost of working capital, etc.

Characteristics that relate to the phenomenon under study not directly but through one or more other characteristics, and that characterize the qualitative side of the phenomenon, are called secondary or qualitative. Such characteristics are: a) relative; b) not summable in space and time. Examples are the capital-labor ratio, profitability, etc. The analysis distinguishes secondary factors of the 1st, 2nd, etc. orders, obtained by successive detailing.

A strictly determined factor model is called complete if the effective indicator is quantitative, and incomplete if the effective indicator is qualitative. In a complete two-factor model, one factor is always quantitative and the second qualitative. In this case, it is recommended to begin the substitution with the quantitative indicator. If there are several quantitative and several qualitative indicators, the values of the factors of the first level of subordination should be changed first, and then those of the lower levels. Thus, using the chain substitution method requires knowing the interrelation of the factors, their subordination, and being able to classify and systematize them correctly.

Now, using our example, let’s look at the procedure for applying the chain substitution method.

The calculation algorithm using the chain substitution method for this model is as follows:

GP_plan = NH_plan × AG_plan = 160,000 million rubles;
GP_cond = NH_fact × AG_plan = 192,000 million rubles;
GP_fact = NH_fact × AG_fact = 240,000 million rubles.

As you can see, the second value of gross output differs from the first in that, when calculating it, the actual number of workers is taken instead of the planned one, while the average annual output per worker is planned in both cases. This means that, due to the increase in the number of workers, output increased by 32,000 million rubles (192,000 − 160,000).

The third value differs from the second in that, when calculating it, the workers' output is taken at the actual level instead of the planned one, while the number of employees is actual in both cases. Hence, due to the increase in labor productivity, the volume of gross output increased by 48,000 million rubles (240,000 − 192,000).

Thus, the overfulfilment of the plan for gross output was the result of the influence of the following factors:

increase in the number of workers: +32,000 million rubles;
increase in labor productivity: +48,000 million rubles;
total: +80,000 million rubles.

The algebraic sum of the factor influences when using this method must necessarily equal the total increment of the effective indicator:

ΔGP(NH) + ΔGP(AG) = 32,000 + 48,000 = 80,000 = ΔGP = 240,000 − 160,000.

The absence of such equality indicates errors in the calculations.
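
The whole procedure is easy to express in a few lines of code. In the sketch below, the planned and actual headcount and output are assumed values, chosen only so that they agree with the indices (1.2 and 1.25) and the totals (160,000 and 240,000 million rubles) used in this example.

```python
# Chain substitution for the two-factor model GP = NH * AG.
# Headcount and output figures are assumptions consistent with the text, not source data.
nh_plan, nh_fact = 1000, 1200          # number of employees (assumed)
ag_plan, ag_fact = 160, 200            # average annual output per employee, million rubles (assumed)

gp_plan = nh_plan * ag_plan            # 160,000
gp_cond = nh_fact * ag_plan            # 192,000  (only headcount replaced by its actual value)
gp_fact = nh_fact * ag_fact            # 240,000  (both factors replaced)

d_gp_nh = gp_cond - gp_plan            # +32,000 due to headcount
d_gp_ag = gp_fact - gp_cond            # +48,000 due to labor productivity
assert d_gp_nh + d_gp_ag == gp_fact - gp_plan   # the decomposition must sum to the total change

print(gp_plan, gp_cond, gp_fact, d_gp_nh, d_gp_ag)
```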

Other methods of analysis, such as integral and logarithmic, can achieve higher accuracy of calculations, but these methods have a more limited scope and require a large amount of calculations, which is inconvenient for conducting operational analysis.

Task 3.

In a certain sense, this task is a consequence of the second standard task, since it is based on the same factor decomposition. The need to solve it arises because the elements of the factor decomposition are absolute values, which are difficult to use for spatio-temporal comparisons. When solving task 3, the factor decomposition is supplemented with relative indicators:

αi = Δy(xi) / y0 × 100%.

Economic interpretation: the coefficient αi shows by what percentage, compared to the base level, the performance indicator has changed under the influence of the i-th factor.

Let's calculate the coefficients α for our example, using the factor decomposition obtained earlier by the chain substitution method:

α(NH) = 32,000 / 160,000 × 100% = 20%;

α(AG) = 48,000 / 160,000 × 100% = 30%.

Thus, the volume of gross output increased by 20% due to an increase in the number of workers and by 30% due to an increase in output. The total increase in gross output was 50%.

Task 4.

This task is also solved on the basis of basic task 2 and comes down to calculating the coefficients:

γi = Δy(xi) / Δy × 100%.

Economic interpretation: the coefficient γi shows the share of the increment of the performance indicator attributable to the change in the i-th factor. There is no difficulty here if all the factor characteristics change in the same direction (all increase or all decrease). If this condition is not met, solving the problem may become complicated. In particular, in the simplest two-factor model, the calculation by the given formula is then not performed, and it is considered that 100% of the increment of the effective indicator is due to the change in the dominant factor characteristic, i.e., the characteristic that changes in the same direction as the effective indicator.

Let's calculate the coefficients γ for our example, using the factor decomposition obtained by the chain substitution method:

γ(NH) = 32,000 / 80,000 × 100% = 40%;

γ(AG) = 48,000 / 80,000 × 100% = 60%.

Thus, the increase in the number of workers accounted for 40% of the total increase in gross output, and the increase in output for 60%. This means that the growth of labor productivity is the determining factor in this situation.
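
Both sets of coefficients (tasks 3 and 4) can be checked with the same decomposition obtained above:

```python
gp_plan, d_gp_nh, d_gp_ag = 160_000, 32_000, 48_000
d_gp = d_gp_nh + d_gp_ag                                   # total increase: 80,000

alpha = [d / gp_plan * 100 for d in (d_gp_nh, d_gp_ag)]    # change relative to the base level, %
gamma = [d / d_gp * 100 for d in (d_gp_nh, d_gp_ag)]       # share of the total increase, %

print(alpha)   # [20.0, 30.0] -> output grew 20% due to headcount, 30% due to productivity
print(gamma)   # [40.0, 60.0] -> 40% / 60% of the total increase
```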

All processes occurring in business are interconnected. There is both a direct and indirect connection between them. Various economic parameters change under the influence of various factors. Factor analysis (FA) allows you to identify these indicators, analyze them, and study the degree of influence.

The concept of factor analysis

Factor analysis is a multidimensional technique that allows you to study the relationships between the parameters of variables. In the process, the structure of covariance or correlation matrices is studied. Factor analysis is used in a variety of sciences: psychometrics, psychology, economics. The basics of this method were developed by psychologist F. Galton.

Objectives of factor analysis

To obtain reliable results, indicators need to be compared on several scales. In the process, the correlations of the obtained values, their similarities and differences, are determined. Let's consider the basic tasks of factor analysis:

  • Detection of existing values.
  • Selection of parameters for a complete analysis of values.
  • Classification of indicators for system work.
  • Detection of relationships between resultant and factor values.
  • Determining the degree of influence of each factor.
  • Analysis of the role of each value.
  • Application of the factor model.

Every parameter that affects the final value must be examined.

Factor analysis techniques

FA methods can be used both in combination and separately.

Deterministic Analysis

Deterministic analysis is used most often, because it is quite simple. It allows you to identify the logic of the impact of the company's main factors and to assess their influence in quantitative terms. As a result of DA, you can understand which factors should be changed to improve the company's performance. Advantages of the method: versatility and ease of use.

Stochastic analysis

Stochastic analysis allows you to analyze existing indirect relationships. That is, there is a study of indirect factors. The method is used if it is impossible to find direct connections. Stochastic analysis is considered complementary. It is only used in certain cases.

What is meant by indirect connections? With a direct connection, a change in the argument directly changes the resulting value. An indirect connection involves a change in the argument followed by a change in several indicators at once. The method is considered auxiliary, because experts recommend studying direct connections first: they allow a more objective picture to be formed.

Stages and features of factor analysis

Analysis for each factor gives objective results. However, it is used extremely rarely. This is due to the fact that complex calculations are performed in the process. To carry them out you will need special software.

Let's consider the stages of FA:

  1. Establishing the purpose of the calculations.
  2. Selection of values ​​that directly or indirectly affect the final result.
  3. Classification of factors for complex research.
  4. Detecting the relationship between the selected parameters and the final indicator.
  5. Modeling of mutual relationships between the result and the factors influencing it.
  6. Determining the degree of impact of the values ​​and assessing the role of each parameter.
  7. Use of the generated factor table in the activities of the enterprise.

NOTE! Factor analysis involves very complex calculations. Therefore, it is better to entrust it to a professional.

IMPORTANT! When carrying out calculations, it is extremely important to correctly select factors that influence the results of the enterprise. The selection of factors depends on the specific area.

Factor analysis of profitability

A profitability analysis is carried out to assess the rationality of resource allocation. As a result, it is possible to determine which factors influence the final result most strongly and to retain only those factors. Based on the data obtained, you can adjust the company's pricing policy. The following factors may influence the cost of production:

  • fixed costs;
  • variable costs;
  • profit.

Reducing costs leads to an increase in profit while other conditions remain unchanged. We can conclude that profitability is affected both by the costs incurred and by the volume of products sold. Factor analysis allows us to determine the degree of influence of these parameters. When does it make sense to carry it out? The main reason is a decrease or an increase in profitability.

Factor analysis is carried out using the following formula:

R_W = ((W_T − S_B − KR_B − UR_B) / W_T) − ((W_B − S_B − KR_B − UR_B) / W_B), where:

W_T – revenue for the current period;

W_B – revenue for the previous (base) period;

S_B – cost of sales for the previous period;

KR_B – commercial expenses for the previous period;

UR_B – management expenses for the previous period.

R_W shows the impact of the change in revenue on profitability (return on sales).

Other formulas

Let's consider the formula for calculating the degree of impact of cost on profitability:

R_S = ((W_T − S_T − KR_B − UR_B) / W_T) − ((W_T − S_B − KR_B − UR_B) / W_T),

where S_T is the cost of production for the current period.

Formula for calculating the impact of management expenses:

R_UR = ((W_T − S_B − KR_B − UR_T) / W_T) − ((W_T − S_B − KR_B − UR_B) / W_T),

where UR_T is management expenses for the current period.

The formula for calculating the impact of business costs is:

R_KR = ((W_T − S_B − KR_T − UR_B) / W_T) − ((W_T − S_B − KR_B − UR_B) / W_T),

where KR_T is commercial expenses for the current period.

The total impact of all factors is calculated using the following formula:

R_total = R_W + R_S + R_UR + R_KR.

IMPORTANT! When making calculations, it makes sense to calculate the influence of each factor separately; the overall FA result on its own is of little value.
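To illustrate how the four coefficients above fit together, here is a small Python sketch (an added illustration of the formulas as given in the text; the suffix _t marks current-period values and _b base-period values, and the function name is an assumption):

    def profitability_factor_impacts(W_t, S_t, KR_t, UR_t, W_b, S_b, KR_b, UR_b):
        """Impact of revenue, cost, management and commercial expenses on return on sales."""
        def ros(W, S, KR, UR):                 # return on sales for a given set of values
            return (W - S - KR - UR) / W
        base = ros(W_t, S_b, KR_b, UR_b)       # current revenue, base values of the other factors
        R_W  = base - ros(W_b, S_b, KR_b, UR_b)        # impact of revenue
        R_S  = ros(W_t, S_t, KR_b, UR_b) - base        # impact of cost
        R_UR = ros(W_t, S_b, KR_b, UR_t) - base        # impact of management expenses
        R_KR = ros(W_t, S_b, KR_t, UR_b) - base        # impact of commercial expenses
        return R_W, R_S, R_UR, R_KR, R_W + R_S + R_UR + R_KR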

Example

Let's consider the organization's indicators for two months (for two periods, in rubles). In July, the organization's income amounted to 10 thousand, production costs - 5 thousand, administrative expenses - 2 thousand, commercial expenses - 1 thousand. In August, the company's income amounted to 12 thousand, production costs - 5.5 thousand, administrative expenses - 1.5 thousand, commercial expenses - 1 thousand. The following calculations are carried out:

R = ((12 thousand − 5.5 thousand − 1 thousand − 2 thousand) / 12 thousand) − ((10 thousand − 5.5 thousand − 1 thousand − 2 thousand) / 10 thousand) = 0.29 − 0.15 = 0.14

From these calculations we can conclude that the organization's profitability (return on sales) increased by 14 percentage points.

Factor analysis of profit

P = PP + RF + RVN, where:

P – profit or loss;

PP – profit from sales;

RF – the result of financial activities;

RVN – the balance of income and expenses from non-operating activities.

Then you need to determine the result from the sale of goods:

PP = N – S1 – S2, where:

N – revenue from the sale of goods at selling prices;

S1 – cost of products sold;

S2 – commercial and administrative expenses.

The key factor in calculating profit is the sales turnover of the company.

NOTE! Factor analysis is extremely difficult to perform manually. You can use special programs for it. The simplest program for calculations and automatic analysis is Microsoft Excel. It has tools for analysis.

All economic processes of enterprises are interconnected and interdependent. Some of them are directly related to each other, some appear indirectly. Thus, an important issue in economic analysis is the assessment of the influence of a factor on a particular economic indicator, and for this purpose factor analysis is used.

Factor analysis of the enterprise. Definition. Goals. Types

Factor analysis refers in the scientific literature to the section of multivariate statistical analysis, where the assessment of observed variables is carried out using covariance or correlation matrices.

Factor analysis was first used in psychometrics and is currently used in almost all sciences, from psychology to neurophysiology and political science. The basic concepts of factor analysis were defined by the English psychologist Galton and then developed by Spearman, Thurstone, and Cattell.

Two goals of factor analysis can be distinguished:
– determining the relationships between variables (classification of variables);
– reducing the number of variables (clustering of variables).

Factor analysis of the enterprise is a comprehensive methodology for the systematic study and assessment of the impact of factors on the value of the performance indicator.

The following types of factor analysis can be distinguished:

  1. Functional, where the effective indicator is defined as a product or an algebraic sum of factors.
  2. Correlation (stochastic) – the relationship between the performance indicator and the factors is probabilistic.
  3. Direct / Reverse – from general to specific and vice versa.
  4. Single-stage/multi-stage.
  5. Retrospective/prospective.

Let's look at the first two in more detail.

In order to carry out factor analysis, the following is necessary:
– all factors must be quantitative;
– there must be at least twice as many factors as performance indicators;
– the sample must be homogeneous;
– the factors must be normally distributed.

Factor analysis is carried out in several stages:
Stage 1. Factors are selected.
Stage 2. Factors are classified and systematized.
Stage 3. The relationship between the performance indicator and the factors is modeled.
Stage 4. Assessing the influence of each factor on the performance indicator.
Stage 5. Practical use of the model.

Methods of deterministic factor analysis and methods of stochastic factor analysis are distinguished.

Deterministic factor analysis is a study in which factors influence the performance indicator functionally. Its methods include the method of chain substitutions, the method of absolute differences, the logarithm method, and the method of relative differences. This type of analysis is the most common due to its ease of use and allows one to understand which factors need to be changed to increase or decrease the performance indicator.

Stochastic factor analysis is a study in which factors influence the performance indicator probabilistically, i.e. when a factor changes, the resulting indicator may take several values (or a range of values). Methods of stochastic factor analysis: game theory, mathematical programming, multiple correlation analysis, matrix models.

Basic provisions

Factor analysis is one of the newer sections of multivariate statistical analysis. This method was originally developed to explain the correlations between input parameters. The result of correlation analysis is a matrix of correlation coefficients. If the number of features (variables) is small, a visual analysis of this matrix can be carried out. As the number of features increases (to 10 or more), visual analysis no longer gives useful results. It turns out that the whole variety of correlations can be explained by the action of several generalized factors, which are functions of the parameters under study; the factors themselves may be unknown, but they can be expressed through the characteristics being studied. The founder of factor analysis is the American scientist L. Thurstone.

Modern statisticians understand factor analysis as a set of methods that, on the basis of a real-life connection between characteristics, allows one to identify latent (hidden) generalizing characteristics of the organizational structure and mechanisms of development of the phenomena and processes being studied.

Example: suppose that n cars are assessed based on 2 criteria:

x 1 – cost of the car,

x 2 – duration of the motor’s working life.

Provided that x1 and x2 are correlated, a directed and fairly dense cluster of points appears in the coordinate system; it can be formally described by new axes F1 and F2 (figure omitted: scatter of points with the new axes F1 and F2).

A characteristic feature of F1 and F2 is that they pass through the dense clusters of points and, in turn, correlate with x1 and x2. The maximum number of new axes is equal to the number of elementary features. Further development of factor analysis showed that this method can be successfully applied to problems of grouping and classifying objects.

Presentation of information in factor analysis.

To carry out factor analysis, the information must be presented in the form of an "objects × features" matrix X:

the rows of the matrix correspond to the observation objects (i = 1, …, n), and the columns correspond to the features (j = 1, …, m).

The features characterizing an object have different dimensions. In order to bring them to a single scale and ensure comparability of the features, the source data matrix is usually normalized. The most common method of normalization is standardization: from the variables x_ij one passes to the standardized variables

z_ij = (x_ij − x̄_j) / s_j,

where x̄_j is the average value of the j-th feature and s_j is its standard deviation. This transformation is called standardization.
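A minimal sketch of this standardization step (illustrative; assumes the data sit in a NumPy array with objects in rows and features in columns):

    import numpy as np

    def standardize(X):
        """Column-wise standardization: z_ij = (x_ij - mean_j) / std_j."""
        return (X - X.mean(axis=0)) / X.std(axis=0)

    # After the transformation each column has zero mean and unit standard deviation:
    # Z = standardize(X); Z.mean(axis=0) ~ 0, Z.std(axis=0) ~ 1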

Basic factor analysis model

The basic factor analysis model has the form:

z_j = a_j1 F_1 + a_j2 F_2 + … + a_jp F_p + d_j u_j,   j = 1, …, m, where:

z_j – the j-th feature (a random value);

F_1, F_2, …, F_p – general (common) factors (random, normally distributed values);

u_j – the characteristic (unique) factor of the j-th feature;

a_j1, a_j2, …, a_jp – factor loadings characterizing the significance of the influence of each factor (the model parameters to be determined).

General factors are essential for the analysis of all characteristics. A characteristic factor relates only to the given characteristic; it reflects the specificity of the characteristic that cannot be expressed through the general factors. The factor loadings a_j1, a_j2, …, a_jp characterize the magnitude of the influence of each general factor on the variation of a given characteristic. The main task of factor analysis is to determine the factor loadings. The variance s_j² of each characteristic can be divided into two components:

    the first part determines the action of the general factors – the communality h_j²;

    the second part determines the action of the characteristic factor – the uniqueness d_j².

Since all variables are presented in standardized form, the variance of each standardized feature is s_j² = 1.

If the general and characteristic factors do not correlate with each other, then the variance of the j-th characteristic can be represented as

s_j² = h_j² + d_j² = a_j1² + a_j2² + … + a_jp² + d_j²,

where a_jk² is the proportion of the variance of the j-th feature attributable to the k-th factor.

The total contribution of any factor to the total variance is equal to

V_k = Σ_j a_jk² (j = 1, …, m).

The contribution of all common factors to the total variance is

V = Σ_k V_k (k = 1, …, p).
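Given a loadings matrix A (features in rows, factors in columns), the communalities, uniquenesses and factor contributions just described can be computed directly; a short sketch under that assumption:

    import numpy as np

    def variance_decomposition(A):
        """A: loadings matrix of size m features x p factors (standardized features assumed)."""
        communalities = (A ** 2).sum(axis=1)   # h_j^2 - variance of feature j explained by common factors
        uniqueness = 1.0 - communalities       # d_j^2, since s_j^2 = 1 for standardized features
        contributions = (A ** 2).sum(axis=0)   # V_k - contribution of factor k to the total variance
        return communalities, uniqueness, contributions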

It is convenient to present the results of factor analysis in the form of a table:

Feature                       Factor loadings                    Communality
                              F1     F2     …     Fp             h_j²
x1                            a11    a12    …     a1p            h1²
x2                            a21    a22    …     a2p            h2²
…                             …      …      …     …              …
xm                            am1    am2    …     amp            hm²
Contribution of the factor    V1     V2     …     Vp

A is the matrix of factor loadings. It can be obtained in various ways; at present, the most widely used is the method of principal components (principal factors).

Computational procedure of the principal factor method.

Solving the problem by the method of principal components comes down to a step-by-step transformation of the source data matrix X:

X – the source data matrix;

Z – the matrix of standardized feature values, z_ij = (x_ij − x̄_j) / s_j;

R – the matrix of pair correlations, R = (1/n) ZᵀZ;

Λ – the diagonal matrix of eigen (characteristic) numbers λ_j, which are found by solving the characteristic equation

|R − λE| = 0,

where E is the identity matrix; λ_j is the dispersion indicator of each principal component, and, subject to standardization of the source data, Σ λ_j = m.

U – the matrix of eigenvectors, which are found from the equation

(R − λ_j E) u_j = 0.

In practice this means solving m systems of linear equations, one for each λ_j: each eigenvalue corresponds to its own system of equations.

Then V, the matrix of normalized eigenvectors, is found.

The factor mapping matrix A is calculated using the formula

A = V Λ^{1/2}.

Then the values of the principal components are found using one of the equivalent formulas:

F = Z V Λ^{−1/2} = Z A Λ^{−1}.
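The chain of transformations X → Z → R → (Λ, U) → V → A → F described above can be traced with a short NumPy sketch; this is an illustrative reading of the procedure, with function and variable names chosen here rather than taken from the text.

    import numpy as np

    def principal_factor(X):
        """Principal-component solution of the factor model for data X (n objects x m features)."""
        Z = (X - X.mean(axis=0)) / X.std(axis=0)    # standardized feature values
        n = Z.shape[0]
        R = Z.T @ Z / n                             # matrix of pair correlations
        eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues and eigenvectors of R
        order = np.argsort(eigvals)[::-1]           # order components by decreasing variance
        lam = np.clip(eigvals[order], 1e-12, None)  # diagonal of Lambda (clipped for numerical safety)
        V = eigvecs[:, order]                       # normalized eigenvectors
        A = V * np.sqrt(lam)                        # factor mapping matrix, A = V Lambda^{1/2}
        F = Z @ V / np.sqrt(lam)                    # principal component values, F = Z V Lambda^{-1/2}
        return lam, A, F                            # sum(lam) = m for standardized data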

A set of four industrial enterprises was assessed according to three characteristic features:

    average annual output per employee, x1;

    level of profitability, x2;

    level of capital productivity, x3.

The result is presented in a standardized matrix Z:

From the matrix Z, the matrix of pair correlations R was obtained:

    Let’s find the determinant of the pairwise correlation matrix (for example, using Faddeev’s method):

    Let's construct a characteristic equation:

    Solving this equation we find:

Thus, the initial elementary characteristics x1, x2, x3 can be generalized by the values of the three principal components:

F1 explains the largest share of the total variation, F2 a smaller share, and F3 the remaining share (about 10%, as noted below).

All three principal components together explain the variation completely, i.e. 100%.

Solving this system we find:

The systems for λ2 and λ3 are constructed similarly. The solution of the system for λ2 is:

The eigenvector matrix U takes the form:

    Dividing each element of the matrix U by the square root of the sum of the squares of the elements of the corresponding (j-th) column, we obtain the normalized matrix V.

Note that the equality VᵀV = E holds.

    We obtain the factor mapping matrix A from the matrix relation

A = V Λ^{1/2}.

Each element a_jr of the matrix A is the pair correlation coefficient between the original feature x_j and the principal component F_r; therefore, all elements satisfy |a_jr| ≤ 1.

The equality h_j² = Σ_r a_jr² = 1 holds when r, the number of components used, equals the number of original features.

The total contribution of each factor to the total variance of the characteristics is equal to

V_r = Σ_j a_jr².

The factor analysis model will take the form:

Let's find the values of the principal components (the matrix F) according to the formula F = Z V Λ^{−1/2}.

The center of the distribution of values ​​of the principal components is at the point (0,0,0).

Further, analytical conclusions based on the calculation results follow after making a decision on the number of significant features and main components and determining the names of the main components. The tasks of recognizing the main components and determining names for them are solved subjectively based on weighting coefficients from the mapping matrix A.

Let's consider the issue of formulating the names of the main components.

Let w1 denote the set of insignificant weighting coefficients (elements close to zero),

w2 – the set of significant weighting coefficients,

w3 – the subset of significant weighting coefficients that are not involved in forming the name of the principal component,

w2 − w3 – the subset of weighting coefficients involved in forming the name.

We calculate the informativeness coefficient for each principal factor:

K_j = Σ_{w2−w3} a_ij² / Σ_{w1∪w2} a_ij².

We consider a set of explainable features to be satisfactory if the values ​​of the informativeness coefficients lie in the range of 0.75-0.95.
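Under the reading given above (squares of the loadings that take part in forming the name in the numerator, squares of all loadings of the factor in the denominator), the informativeness coefficient can be computed as follows; the 0.3 significance threshold is an illustrative assumption, not a value fixed by the text:

    import numpy as np

    def informativeness(A, threshold=0.3, excluded=None):
        """Informativeness coefficient per factor; A is the loadings matrix (features x factors).

        threshold: |loading| below which a coefficient is treated as insignificant (set w1);
        excluded:  optional boolean mask for significant loadings not used in the name (set w3).
        """
        A2 = A ** 2
        used = np.abs(A) >= threshold          # set w2
        if excluded is not None:
            used &= ~excluded                  # keep w2 - w3 in the numerator
        return (A2 * used).sum(axis=0) / A2.sum(axis=0)

Applied to the loadings listed below, this threshold reproduces the sets w1 and w2 given in the example.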

a11 = 0.776   a12 = −0.130   a13 = 0.308

a21 = 0.904   a22 = −0.210   a23 = −0.420

a31 = 0.616   a32 = 0.902   a33 = 0.236

For j = 1: w1 = ∅, w2 = {a11, a21, a31},

K1 = (0.776² + 0.904² + 0.616²) / (0.776² + 0.904² + 0.616²) = 1.

For j = 2: w1 = {a12, a22}, w2 = {a32},

K2 = 0.902² / (0.130² + 0.210² + 0.902²) ≈ 0.93.

For j = 3: w1 = {a33}, w2 = {a13, a23},

K3 = (0.308² + 0.420²) / (0.308² + 0.420² + 0.236²) ≈ 0.83.

The values of the features x1, x2, x3 determine the composition of the principal component F1 by 100%; the largest contribution is made by the feature x2, whose meaning is profitability. An appropriate name for F1 is therefore production efficiency.

F2 is determined by the component x3 (capital productivity); let's call it efficiency of use of fixed production assets.

F3 is determined by the components x1 and x2; it may be excluded from the analysis because it explains only 10% of the total variation.


Comparison of two normal population means whose variances are known

Let the general populations X and Y be normally distributed, and let their variances be known (for example, from previous experience or found theoretically). Based on independent samples of sizes n and m drawn from these populations, the sample means x̄ and ȳ were found.

It is required, using the sample means, to test at a given significance level the null hypothesis that the general means (mathematical expectations) of the populations under consideration are equal to each other, i.e. H0: M(X) = M(Y).

Considering that the sample means are unbiased estimates of the general means, i.e. M(x̄) = M(X) and M(ȳ) = M(Y), the null hypothesis can be written as H0: M(x̄) = M(ȳ).

Thus, it is necessary to check that the mathematical expectations of the sample means are equal to each other. This task is posed because, as a rule, sample means are different. The question arises: do the sample means differ significantly or insignificantly?

If it turns out that the null hypothesis is true, that is, the general means are the same, then the difference in the sample means is insignificant and is explained by random reasons and, in particular, by the random selection of sample objects.

If the null hypothesis is rejected, i.e., the general means are not the same, then the difference in the sample means is significant and cannot be explained by random reasons. This is explained by the fact that the general averages themselves (mathematical expectations) are different.

As a test of the null hypothesis, we take the random variable

Z = (x̄ − ȳ) / √(D(X)/n + D(Y)/m).

The criterion Z is a normalized normal random variable. Indeed, Z is normally distributed, since it is a linear combination of the normally distributed values x̄ and ȳ (these are themselves normally distributed as sample means found from samples drawn from normal populations); Z is normalized because M(Z) = 0 if the null hypothesis is true, and D(Z) = 1, since the samples are independent.

The critical region is constructed depending on the type of competing hypothesis.

First case. Null hypothesis H0: M(X) = M(Y). Competing hypothesis H1: M(X) ≠ M(Y).

In this case, a two-sided critical region is constructed based on the requirement that the probability of the criterion falling into this region, assuming the null hypothesis is true, is equal to the accepted significance level.

The greatest power of the criterion (the probability of the criterion falling into the critical region if the competing hypothesis is true) is achieved when the “left” and “right” critical points are chosen so that the probability of the criterion falling into each interval of the critical region is equal to:

P(Z < z_left.cr) = α/2,

P(Z > z_right.cr) = α/2. (1)

Since Z is a normalized normal quantity, and the distribution of such a quantity is symmetrical about zero, the critical points are symmetrical about zero.

Thus, if we denote the right boundary of the two-sided critical region by zcr, then the left boundary is −zcr.

So, to specify the two-sided critical region it is enough to find its right boundary: the critical region itself is Z < −zcr, Z > zcr, and the region of acceptance of the null hypothesis is (−zcr, zcr).

Let us show how to find zcr - the right boundary of the two-sided critical region, using the Laplace function Ф(Z). It is known that the Laplace function determines the probability of a normalized normal random variable, for example Z, falling in the interval (0;z):

P(0 < Z < z) = Ф(z). (2)

Since the distribution of Z is symmetric about zero, the probability of Z falling into the interval (0; ∞) is equal to 1/2. Consequently, if we divide this interval by the point zcr into the intervals (0, zcr) and (zcr, ∞), then by the addition theorem P(0 < Z < zcr) + P(Z > zcr) = 1/2.

By virtue of (1) and (2), we obtain Ф(zcr) + α/2 = 1/2. Therefore, Ф(zcr) = (1 − α)/2.

Hence we conclude: in order to find the right boundary of the two-sided critical region (zcr), it is enough to find the value of the argument of the Laplace function that corresponds to the function value (1 − α)/2.

Then the two-sided critical region is determined by the inequalities Z < −zcr, Z > zcr, or by the equivalent inequality |Z| > zcr, and the region of acceptance of the null hypothesis by the inequality −zcr < Z < zcr, or by the equivalent inequality |Z| < zcr.

Let us denote the value of the criterion calculated from the observational data by z_obs and formulate the rule for testing the null hypothesis.

Rule.

1. Calculate the observed value of the criterion: z_obs = (x̄ − ȳ) / √(D(X)/n + D(Y)/m).

2. Using the table of the Laplace function, find the critical point zcr from the equality Ф(zcr) = (1 − α)/2.

3. If |z_obs| < zcr, there is no reason to reject the null hypothesis.

If |z_obs| > zcr, the null hypothesis is rejected.

Second case. Null hypothesis H0: M(X)=M(Y). Competing hypothesis H1: M(X)>M(Y).

In practice, such a case occurs if professional considerations suggest that the general mean of one population is greater than the general mean of another. For example, if a technological process improvement is introduced, then it is natural to assume that it will lead to an increase in product output.

In this case, a right-sided critical region is constructed based on the requirement that the probability of a criterion falling into this region, assuming the null hypothesis is true, is equal to the accepted significance level:

P(Z > zcr) = α. (3)

Let's show how to find the critical point using the Laplace function. Let's use the relation

P(0 < Z < zcr) + P(Z > zcr) = 1/2.

By virtue of (2) and (3), we have Ф(zcr) + α = 1/2. Therefore, Ф(zcr) = (1 − 2α)/2.

From here we conclude that, in order to find the boundary of the right-sided critical region (zcr), it is enough to find the value of the argument of the Laplace function that corresponds to the function value (1 − 2α)/2. Then the right-sided critical region is determined by the inequality Z > zcr, and the region of acceptance of the null hypothesis by the inequality Z < zcr.

Rule.

1. Calculate the observed value of the criterion z_obs.

2. Using the table of the Laplace function, find the critical point zcr from the equality Ф(zcr) = (1 − 2α)/2.

3. If z_obs < zcr, there is no reason to reject the null hypothesis. If z_obs > zcr, the null hypothesis is rejected.

Third case. Null hypothesis H0: M(X) = M(Y). Competing hypothesis H1: M(X) < M(Y).

In this case, a left-sided critical region is constructed based on the requirement that the probability of the criterion falling into this region, assuming the null hypothesis is true, is equal to the accepted significance level: P(Z < z'cr) = α, i.e. z'cr = −zcr. Thus, in order to find the point z'cr, it is enough first to find the "auxiliary point" zcr and then take the found value with a minus sign. The left-sided critical region is then determined by the inequality Z < −zcr, and the region of acceptance of the null hypothesis by the inequality Z > −zcr.

Rule.

1. Calculate the observed value of the criterion z_obs.

2. Using the table of the Laplace function, find the "auxiliary point" zcr from the equality Ф(zcr) = (1 − 2α)/2, and then set z'cr = −zcr.

3. If z_obs > −zcr, there is no reason to reject the null hypothesis.

If z_obs < −zcr, the null hypothesis is rejected.
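The three testing rules can be gathered into one small Python function; a sketch (added here for illustration) that uses scipy's standard normal quantile in place of the Laplace-function table:

    from math import sqrt
    from scipy.stats import norm

    def compare_means_known_variance(x_mean, y_mean, var_x, var_y, n, m,
                                     alpha=0.05, alternative="two-sided"):
        """Test H0: M(X) = M(Y) for normal populations with known variances."""
        z_obs = (x_mean - y_mean) / sqrt(var_x / n + var_y / m)
        if alternative == "two-sided":      # H1: M(X) != M(Y);  Ф(zcr) = (1 - alpha)/2
            z_cr = norm.ppf(1 - alpha / 2)
            reject = abs(z_obs) > z_cr
        elif alternative == "greater":      # H1: M(X) > M(Y);   Ф(zcr) = (1 - 2*alpha)/2
            z_cr = norm.ppf(1 - alpha)
            reject = z_obs > z_cr
        else:                               # H1: M(X) < M(Y);   left-sided region, z'cr = -zcr
            z_cr = norm.ppf(1 - alpha)
            reject = z_obs < -z_cr
        return z_obs, z_cr, reject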
