Wednesday, November 23, 2022

Model-driven data analytics (MDDA): Resources for teachers and students

The technique of model-driven data analytics (MDDA) involves creating a path model that expresses an applied theory, and then testing that model using path analysis with latent variables. Path analysis with latent variables is generally known as structural equation modeling (SEM).

MDDA emerged from the work of a special category of users of the software WarpPLS – data analysis consultants, who regularly work with organizations to provide data-driven recommendations.

While MDDA can be implemented through a variety of software tools, it has found wide adoption among WarpPLS users, because of the many powerful features of this software that can be used in this context. Moreover, in WarpPLS all analyses are model-driven, which makes this software much more user-friendly than other software tools that rely on extensive scripting to conduct analyses.

The website linked below provides several resources for teachers and students, including: a textbook, available as a free online document, which may be used by teachers of university courses on MDDA; datasets, which include not only data but also scenarios, questions, and variables (these are used to illustrate how MDDA can be used to address the needs of various organizations); and YouTube videos, which provide step-by-step illustrations of how to analyze data in the context of scenarios, questions, and variables.

Saturday, March 12, 2022

WarpPLS 8.0 upgraded to stable: Logistic regression, full latent growth graphs, HTMT2 ratios, and more!

Dear colleagues:

Version 8.0 of WarpPLS is now available as a stable version.

Please use the link below to go to the WarpPLS blog post describing this version’s new features.

You can download and install it for a free trial from:

Users of previous versions can use the same license information that they already have; it will work with this new version for the remainder of their license periods.

Best regards to all!

Saturday, January 22, 2022

WarpPLS 8.0 beta now available: Logistic regression, full latent growth graphs, HTMT2 ratios, and more!

Version 8.0 of WarpPLS is now available, as a beta version. You can download and install it for a free trial from:

Each new version of the software incorporates features that aim at achieving an important end goal: to allow users to employ SEM to conduct any of the major statistical tests, from relatively simple tests such as comparisons of means to more sophisticated ones such as nonlinear SEM tests employing logistic regression. Among the community of users of this software, there are very sophisticated SEM experts who constantly challenge us to implement new data analysis features, as well as to make the existing features as easy to use as possible. Because of the constant input from our users, including those who are very knowledgeable about SEM, the software now arguably provides the most extensive set of features of any SEM software. We hope to continue on this path as the SEM field evolves. Below we outline new features added to the current version of the software.

Logistic regression variables. The menu option “Explore logistic regression” now allows you to create a logistic regression variable as a new indicator that has both unstandardized and standardized values. Logistic regression is normally used to convert an endogenous variable on a non-ratio scale (e.g., dichotomous) into a variable reflecting probabilities. You need to choose the variable to be converted, which should be an endogenous variable, and its predictors. The new logistic regression variable is meant to be used as a replacement for the endogenous variable on which it is based. Two algorithms are available: probit and logit. The former is recommended for dichotomous variables; the latter for non-ratio variables where the number of different values (a.k.a. “distinct observations”) is greater than 2 but still significantly smaller than the sample size (e.g., 10 different values over a sample size of 100). The unstandardized values of a logistic regression variable are probabilities, ranging from 0 to 1. Since a logistic regression variable can be severely collinear with its predictors, you can set a local full collinearity VIF cap for the logistic regression variable. Predictor-criterion collinearity, or lateral collinearity (Kock & Lynn, 2012), is rarely assessed or controlled in classic logistic regression algorithms.
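For readers who want to see the general idea outside the software, here is a minimal sketch in Python with NumPy. It is not the algorithm WarpPLS uses internally; it is a textbook binary probit/logit fit by iteratively reweighted least squares, applied to a dichotomous variable. The function names (fit_binary_model, logistic_variable), the simulated data, and the choice to standardize with the sample standard deviation are all assumptions made for illustration.

```python
import math
import numpy as np

# Vectorized error function, used for the standard normal CDF (probit link).
_erf = np.vectorize(math.erf)

def _mean_and_slope(eta, link):
    """Return P(y=1) and its derivative with respect to the linear predictor."""
    if link == "logit":
        mu = 1.0 / (1.0 + np.exp(-eta))
        return mu, mu * (1.0 - mu)
    if link == "probit":
        mu = 0.5 * (1.0 + _erf(eta / math.sqrt(2.0)))
        pdf = np.exp(-0.5 * eta ** 2) / math.sqrt(2.0 * math.pi)
        return mu, pdf
    raise ValueError("link must be 'logit' or 'probit'")

def fit_binary_model(X, y, link="probit", n_iter=50):
    """Fit a binary probit/logit model by IRLS; intercept is the first coefficient."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        eta = X1 @ beta
        mu, slope = _mean_and_slope(eta, link)
        mu = np.clip(mu, 1e-9, 1.0 - 1e-9)      # guard against division by zero
        slope = np.maximum(slope, 1e-9)
        w = slope ** 2 / (mu * (1.0 - mu))       # IRLS weights
        z = eta + (y - mu) / slope               # working response
        beta = np.linalg.solve((X1 * w[:, None]).T @ X1, X1.T @ (w * z))
    return beta

def logistic_variable(X, y, link="probit"):
    """Return (probabilities, standardized probabilities) for a 0/1 variable y."""
    beta = fit_binary_model(X, y, link)
    eta = np.column_stack([np.ones(len(X)), X]) @ beta
    probs, _ = _mean_and_slope(eta, link)
    z = (probs - probs.mean()) / probs.std(ddof=1)
    return probs, z

# Simulated example (hypothetical data): two predictors, one dichotomous outcome.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
true_eta = 0.8 * X[:, 0] - 0.5 * X[:, 1]
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-true_eta))).astype(float)

probs, z = logistic_variable(X, y, link="probit")
```

In this sketch, probs plays the role of the unstandardized values (probabilities between 0 and 1) and z the standardized values; either could stand in for the original dichotomous variable in a later analysis.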

Absolute and relative variation measures. You can now view the number of different values (a.k.a. “distinct observations”) for all indicators and latent variables, as well as the ratio between the number of different values and sample size. The first is an absolute and the second a relative variation measure. These are available under the menu options “View or save correlations and descriptive statistics for indicators” and “View latent variable coefficients”, respectively. These measures can help inform decisions about whether to use logistic regression, particularly in connection with endogenous latent variables. If the number of different values is significantly smaller than the sample size (e.g., 10 different values over a sample size of 100) for an endogenous latent variable, that means that a new logistic regression variable could be created and used as a replacement for the endogenous variable. If several predictors are available, the new logistic regression variable will incorporate more variation than the endogenous variable on which it is based, which will typically be reflected in larger coefficients of association (e.g., path coefficients) when the logistic regression variable is used in the model.
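The two measures are simple to reproduce by hand. The sketch below, in Python with NumPy, counts the different values in a variable and divides by the sample size; the function name variation_measures and the example data are hypothetical, and the software may treat missing values or rounding differently.

```python
import numpy as np

def variation_measures(values):
    """Return (number of different values, that number divided by sample size)."""
    values = np.asarray(values)
    n_distinct = len(np.unique(values))
    return n_distinct, n_distinct / len(values)

# Hypothetical variable: 100 observations that take only 10 different values.
scores = np.tile(np.arange(1, 11), 10)
n_distinct, ratio = variation_measures(scores)
print(n_distinct, ratio)  # prints: 10 0.1
```

A ratio this low is the kind of situation the paragraph above describes, where a logistic regression variable might be considered as a replacement.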

Graphs for full latent growth coefficients. You can now view several graphs for each of the full latent growth coefficients provided under the menu option “Explore full latent growth”. Full latent growth coefficients have a number of applications, such as moderating effects analyses, nonlinearity tests, multi-group and measurement invariance tests, and the assessment of moderated mediation effects. Each of the graphs is made up of several plots, which show how the selected coefficients (e.g., path coefficients) for the relationship between the variables on the X and Y axes change as the latent growth variable goes from low to high. The following graph menu options are available: “Full sample splits (megaphones)”, “Partial sub-samples splits (megaphones)”, “Full sample splits (lines)”, “Partial sub-samples splits (lines)”, “Full sample splits (bars)”, and “Partial sub-samples splits (bars)”.
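The notion of a coefficient changing as a growth variable goes from low to high can be illustrated with a rough sketch. The Python code below is not the software's full latent growth procedure, and it does not reproduce any of the graph types listed above. It simply sorts a simulated sample by a growth variable, cuts it into equal sub-samples, and computes the standardized coefficient for a single predictor in each one (with one predictor, that coefficient equals the Pearson correlation). The function name coefficients_by_split and the data are assumptions.

```python
import numpy as np

def coefficients_by_split(x, y, growth, n_splits=3):
    """Standardized x -> y coefficient within each sub-sample, ordered from low to high growth."""
    order = np.argsort(growth)
    coefs = []
    for idx in np.array_split(order, n_splits):
        # With one predictor, the standardized coefficient is the correlation.
        coefs.append(float(np.corrcoef(x[idx], y[idx])[0, 1]))
    return coefs

# Simulated data (hypothetical): the x -> y slope rises with the growth variable.
rng = np.random.default_rng(1)
n = 600
growth = rng.normal(size=n)
x = rng.normal(size=n)
y = (0.2 + 0.4 * growth) * x + rng.normal(scale=0.5, size=n)

coefs = coefficients_by_split(x, y, growth)
print(coefs)  # one coefficient per sub-sample, low growth first
```

Plotting these values against the sub-sample order would give a crude picture of how the relationship strengthens or weakens along the growth variable.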

HTMT2 ratios. The sub-option “Discriminant validity coefficients (extended set)”, under the menu option “Explore additional coefficients and indices”, now allows you to inspect the newest version of the set of heterotrait-monotrait (HTMT) ratios calculated by the software. These have been dubbed HTMT2 ratios. The HTMT and HTMT2 ratios have been proposed for discriminant validity assessment, particularly in the context of composite-based SEM via classic PLS algorithms, as opposed to factor-based SEM via modern algorithms that estimate factors (which have been available from this software for quite some time now). Discriminant validity is a measure of the quality of a measurement instrument; the instrument itself is typically a set of question-statements. A measurement instrument has good discriminant validity if the question-statements (or other measures) associated with each latent variable are not confused by the respondents, in terms of their meaning, with the question-statements associated with other latent variables.
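To make the idea of a heterotrait-monotrait ratio concrete, here is a minimal Python sketch of the classic HTMT ratio for two latent variables: the average correlation between indicators of different latent variables, divided by the geometric mean of the average correlations among indicators of the same latent variable. This is the commonly cited classic HTMT definition, not necessarily the HTMT2 calculation performed by the software. The function name htmt, the use of absolute correlations, and the simulated data are assumptions of this sketch.

```python
import numpy as np

def htmt(items_a, items_b):
    """Classic HTMT ratio for two blocks of indicators (rows = cases, columns = indicators)."""
    ka, kb = items_a.shape[1], items_b.shape[1]
    # Absolute correlations among all indicators of both blocks (a choice made in this sketch).
    r = np.abs(np.corrcoef(np.column_stack([items_a, items_b]), rowvar=False))
    hetero = r[:ka, ka:].mean()                                  # between-block correlations
    mono_a = r[:ka, :ka][np.triu_indices(ka, k=1)].mean()        # within block A
    mono_b = r[ka:, ka:][np.triu_indices(kb, k=1)].mean()        # within block B
    return float(hetero / np.sqrt(mono_a * mono_b))

# Simulated data (hypothetical): two factors correlated at about 0.3, three indicators each.
rng = np.random.default_rng(2)
n = 500
f1 = rng.normal(size=n)
f2 = 0.3 * f1 + np.sqrt(1 - 0.3 ** 2) * rng.normal(size=n)
items_a = 0.8 * f1[:, None] + 0.6 * rng.normal(size=(n, 3))
items_b = 0.8 * f2[:, None] + 0.6 * rng.normal(size=(n, 3))

ratio = htmt(items_a, items_b)
print(round(ratio, 3))
```

A low ratio suggests that the two sets of question-statements are being told apart by respondents; ratios approaching 1 suggest they are being confused with one another.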

Incremental interface improvement. This is conducted in each new version of the software. At several points the code has been modified so that the user interface experiences are improved. This has led in several cases to what appears to be a smoother flow through the several steps and procedures guided by the user interface. Several elements of the graphical user interface, such as screens and warning messages, have been optimized so that users can perform SEM analysis tasks with only a few clicks – and in a straightforward fashion. Nevertheless, care is always taken to ensure that the user interfaces do not change too much, otherwise users would have to re-learn how to use the interface whenever a new version is released.

Incremental code optimization. This is also conducted in each new version of the software. At several points the code has been optimized for speed, stability, and coefficient estimation precision. In some cases, the optimization has led to lesser propagation of sampling error, making the software reach accurate results at lower sample sizes – that is, increasing the statistical efficiency of the software. These incremental code optimization changes have led to incremental gains in speed even as new features have been added. More often than not, new features require additional computational steps and often complex calculations, mostly to generate internal checks and coefficients that were not available before.

Take a look at the following videos, which have been created using this new version. They illustrate the new features outlined above.

Explore Logistic Regression in WarpPLS

View the Number of Different Values for Variables in WarpPLS

View Full Latent Growth Graphs in WarpPLS

Holistic Measurement Model Assessment in SEM with WarpPLS

Reduce Common Structural Variation with WarpPLS

Assessing Multiple Reciprocal Relationships in SEM with WarpPLS

Choose the Correlation Signs for Anchor Variables in a Multilevel Analysis with WarpPLS

Using Logistic Regression in PLS-SEM with Composites and Factors

Conducting a What-if Analysis in PLS-SEM with Analytic Composites