version 3.22

Perform CNA - coincidence analysis using QCA

Description

This function mimics the functionality in the package cna, finding all possible necessary and sufficient solutions for all possible outcomes in a specific dataset.

Usage

causalChain(data, ordering = NULL, strict = FALSE, pi.cons = 0, pi.depth = 0, sol.cons = 0, sol.cov = 1, sol.depth = 0, ...)

Arguments

data A data frame containing calibrated causal conditions.
ordering A character string, or a list of character vectors specifying the causal ordering of the causal conditions.
strict Logical, prevents causal conditions on the same temporal level from acting as outcomes for each other.
pi.cons Numerical fuzzy value between 0 and 1, minimal consistency threshold for a prime implicant to be declared as sufficient.
pi.depth Integer, a maximum number of causal conditions to be used when searching for conjunctive prime implicants.
sol.cons Numerical fuzzy value between 0 and 1, minimal consistency threshold for a model to be declared as sufficient.
sol.cov Numerical fuzzy value between 0 and 1, minimal coverage threshold for a model to be declared as necessary.
sol.depth Integer, a maximum number of prime implicants to be used when searching for disjunctive solutions.
... Other arguments to be passed to functions minimize() and truthTable().

Details

Although claiming to be a novel technique, coincidence analysis is yet another form of Boolean minimization. What it does is very similar and results in the same set of solutions as performing separate QCA analyses where every causal condition from the data is considered an outcome.

This function aims to demonstrate this claim and to show that results from package cna can be obtained with package QCA. It is not intended to offer a complete replacement for the function cna(), but only to replicate its so-called "asf" (atomic solution formulas).

The three most important arguments from function cna() have direct correspondents in function minimize():

con corresponds to sol.cons
con.msc corresponds to pi.cons
cov corresponds to sol.cov
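The three thresholds above are compared against consistency (inclusion) and coverage scores. The following is a minimal sketch of how such scores are conventionally computed for fuzzy sets, assuming the standard sufficiency definitions. The helpers consistency() and coverage() and the toy scores are hypothetical and are not functions or data from package QCA or package cna.

```r
# Hypothetical helpers, not part of package QCA or cna.
# x: membership scores in a candidate expression (a prime implicant or a model)
# y: membership scores in the outcome
consistency <- function(x, y) sum(pmin(x, y)) / sum(x)  # degree to which x is a subset of y
coverage    <- function(x, y) sum(pmin(x, y)) / sum(y)  # share of y accounted for by x

# Toy fuzzy scores, invented for illustration only
x <- c(0.9, 0.8, 0.2, 0.1)
y <- c(1.0, 0.7, 0.6, 0.1)

consistency(x, y)
coverage(x, y)

# A model passes thresholds such as sol.cons = 0.9 and sol.cov = 0.85
# only if both scores reach them
passes <- consistency(x, y) >= 0.9 && coverage(x, y) >= 0.85
passes
```

In this sketch the expression is highly consistent but covers too little of the outcome, so it would be kept as sufficient but rejected as necessary under the stated thresholds.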

Two other arguments from function cna() have been directly imported in this function, to complete the list of arguments that generate the same results.

The argument ordering splits the causal conditions into different temporal levels, where conditions from prior levels can act as causal conditions, but not as outcomes, for the conditions on subsequent levels. One simple way to split conditions is to use a list object, where different components act as different temporal levels, in the order of their index in the list: conditions from the first component act as the oldest causal factors, while those from the last component are part of the most recent temporal level.

Another, perhaps simpler way to express the same thing is to use a single character string, where factors on the same level are separated by a comma, and temporal levels are separated by the sign <.

A possible example is: "A, B, C < D, E < F".

Here, there are three temporal levels: conditions A, B and C can act as causal factors for the conditions D, E and F, while the reverse is not possible. Given that D, E and F occur on subsequent temporal levels, they cannot act as causal conditions for A, B or C. The same holds for D and E, which can act as causal conditions for F, whereas F cannot act as a causal condition for D or E, and certainly not for A, B or C.
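The equivalence between the string form and the list form can be sketched in base R. The helper parse_ordering() below is hypothetical: it only illustrates the notation and is not claimed to be how causalChain() parses its argument internally.

```r
# Hypothetical helper, not part of package QCA: split an ordering string
# into a list of temporal levels, oldest level first.
parse_ordering <- function(ordering) {
    lvls <- strsplit(ordering, split = "<", fixed = TRUE)[[1]]
    lapply(lvls, function(lvl) {
        trimws(strsplit(lvl, split = ",", fixed = TRUE)[[1]])
    })
}

as_string <- "A, B, C < D, E < F"
as_list   <- list(c("A", "B", "C"), c("D", "E"), "F")

# Both notations describe the same three temporal levels
same <- identical(parse_ordering(as_string), as_list)
same
```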

The argument strict controls whether causal conditions from the same temporal level may be outcomes for each other. If activated, none of A, B and C can act as causal conditions for the other two, and the same thing happens in the next temporal level where neither D nor E can be causally related to each other.
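The combined effect of ordering and strict can be sketched as a small base R function that lists, for every condition treated as an outcome, which other conditions are allowed as its causal factors. The helper allowed_causes() is hypothetical and only mirrors the rules described above. It is not taken from package QCA.

```r
# Hypothetical helper, not part of package QCA.
# lvls:   list of character vectors, one per temporal level (oldest first)
# strict: if TRUE, conditions on the same level cannot explain each other
allowed_causes <- function(lvls, strict = FALSE) {
    result <- list()
    for (i in seq_along(lvls)) {
        # all conditions from earlier temporal levels
        earlier <- as.character(unlist(lvls[seq_len(i - 1)]))
        for (outcome in lvls[[i]]) {
            same_level <- setdiff(lvls[[i]], outcome)
            result[[outcome]] <- c(earlier, if (!strict) same_level)
        }
    }
    result
}

lvls <- list(c("A", "B", "C"), c("D", "E"), "F")

allowed_causes(lvls)$D                 # earlier conditions plus E
allowed_causes(lvls, strict = TRUE)$D  # earlier conditions only
allowed_causes(lvls, strict = TRUE)$A  # nothing is allowed to explain A
```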

Although the two functions reach the same results, they follow different methods. The input for the minimization behind the function cna() is a coincidence list, while in package QCA the input for the minimization procedure is a truth table. The difference is subtle but important, its most important consequence being that package cna is not exhaustive.

To find a set of solutions in a reasonable time, the formal choice in package cna is to deliberately stop the search at certain (default) depths of complexity. Users are free to experiment with these depths from the argument maxstep, but there is no guarantee the results will be exhaustive.

On the other hand, the function causalChain(), and generally all related functions from package QCA, spend more time to make sure the search is exhaustive. Depths can be set via the arguments pi.depth and sol.depth of function minimize(), but unlike in package cna these are not mandatory.

By default, the package QCA employs a different search algorithm based on Consistency Cubes (Dusa, 2017), analysing all possible combinations of causal conditions and all possible combinations of their respective levels. The structure of the input dataset (number of causal conditions, number of levels, number of unique rows in the truth table) has a direct implication on the search time, as all of those characteristics become entry parameters when calculating all possible combinations.
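To give a sense of how the structure of the data drives the search, the following sketch counts candidate conjunctions under a simplifying assumption: every condition has the same number of levels, and a conjunction of complexity k fixes one level for each of the k conditions it uses. The helper count_conjunctions() is hypothetical, and the count is only an upper bound on the candidates. It does not describe how the Consistency Cubes algorithm prunes its search.

```r
# Hypothetical helper, not part of package QCA.
# n_conditions: number of causal conditions
# n_levels:     number of levels per condition (2 for crisp or fuzzy data)
# depth:        maximum number of conditions per conjunction
count_conjunctions <- function(n_conditions, n_levels = 2, depth = n_conditions) {
    k <- seq_len(min(depth, n_conditions))
    # choose which k conditions to use, then one level for each of them
    sum(choose(n_conditions, k) * n_levels^k)
}

count_conjunctions(5)               # binary conditions, all complexity levels: 242
count_conjunctions(5, depth = 2)    # conjunctions of at most two conditions: 50
count_conjunctions(5, n_levels = 3) # three levels per condition: 1023
```

Limiting the depth, which is the role played by an argument such as pi.depth, shrinks the number of candidates considerably, and adding levels or conditions inflates it.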

Consequently, two kinds of depth arguments are provided:

pi.depth the maximum number of causal conditions needed to construct a prime implicant, the complexity level where the search can be stopped, as long as the PI chart can be solved.
sol.depth the maximum number of prime implicants needed to find a solution (to cover all initial positive output configurations)

These arguments introduce a possible new way of deriving prime implicants and solutions, that can lead to different results (i.e. even more parsimonious) compared to the classical Quine-McCluskey. When either of them is modified from the default value of 0, the minimization method is automatically set to "CCubes" and the remainders are automatically included in the minimization.

The higher these depths, the higher the search time. Conversely, the search time can be significantly shorter if these depths are smaller. Irrespective of how large pi.depth is, the algorithm will always stop at a maximum complexity level where no new, non-redundant prime implicants are found. The argument sol.depth is relevant only when activating the argument all.sol to solve the PI chart.

Exhaustiveness is guaranteed in package QCA precisely because it uses a truth table as an input for the minimization procedure. The only exception is the option of finding solutions based on their consistency, with the argument sol.cons: for large PI charts, the search time can grow very quickly. If not otherwise specified in the argument sol.depth, the function causalChain() silently sets a complexity level of 5 prime implicants per solution.
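The effect of capping the number of prime implicants per solution can be illustrated with simple arithmetic. The sketch below assumes that candidate solutions are disjunctions of at most depth prime implicants taken from the PI chart. The helper count_models() and the chart size of 30 are hypothetical, chosen only for illustration.

```r
# Hypothetical helper, not part of package QCA.
# n_pis: number of prime implicants in the PI chart
# depth: maximum number of prime implicants per candidate solution
count_models <- function(n_pis, depth = n_pis) {
    sum(choose(n_pis, seq_len(min(depth, n_pis))))
}

count_models(30)     # every non-empty combination: 2^30 - 1 = 1073741823
count_models(30, 5)  # at most 5 prime implicants per solution: 174436
```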

Value

A list of length equal to the number of columns in the data. Each component contains the result of the QCA minimization for that specific column acting as an outcome.

Examples

# The following examples assume the package cna is installed
library(cna)
cna(d.educate, what = "a")

--- Coincidence Analysis (CNA) ---

Factors: U, D, L, G, E

Atomic solution formulas:
-------------------------
Outcome E:
 solution        consistency coverage complexity
 L + G <-> E               1        1          2
 U + D + G <-> E           1        1          3

Outcome L:
 solution    consistency coverage complexity
 U + D <-> L           1        1          2

# same results with
cc <- causalChain(d.educate)
cc

M1: U + D <=> L

M1: L + G <=> E
M2: U + D + G <=> E

# inclusion and coverage scores can be inspected for each outcome
cc$E$IC

                   -------------------
       inclS   PRI   covS   covU   (M1)   (M2)
-----------------------------------------------
1  G   1.000  1.000  0.571  0.143  0.143  0.143
-----------------------------------------------
2  U   1.000  1.000  0.571  0.000         0.143
3  D   1.000  1.000  0.571  0.000         0.143
4  L   1.000  1.000  0.857  0.000  0.429
-----------------------------------------------
   M1  1.000  1.000  1.000
   M2  1.000  1.000  1.000

# another example, function cna() requires specific complexity depths
cna(d.women, maxstep = c(3, 4, 9), what = "a")

--- Coincidence Analysis (CNA) ---

Factors: ES, QU, WS, WM, LP, WNP

Atomic solution formulas:
-------------------------
Outcome WNP:
 solution                           consistency coverage complexity
 WS + ES*WM + es*LP + QU*LP <-> WNP           1        1          7
 WS + ES*WM + QU*LP + WM*LP <-> WNP           1        1          7

# same results, no specific depths are required
causalChain(d.women)

M1: WS + ~ES*LP + ES*WM + QU*LP <=> WNP
M2: WS + ES*WM + QU*LP + WM*LP <=> WNP

# multivalue data require a different function in package cna
mvcna(d.pban, ordering = list(c("C", "F", "T", "V"), "PB"),
      cov = 0.95, maxstep = c(6, 6, 10), what = "a")

--- Coincidence Analysis (CNA) ---

Causal ordering:
C, F, T, V < PB

Atomic solution formulas:
-------------------------
Outcome PB=1:
 solution                                         consistency coverage complexity
 C=1 + F=2 + C=0*F=1 + C=2*V=0 <-> PB=1                     1    0.952          6
 C=1 + F=2 + C=0*T=2 + C=2*V=0 <-> PB=1                     1    0.952          6
 C=1 + F=2 + C=2*F=0 + C=0*F=1 + F=1*V=0 <-> PB=1           1    0.952          8
 C=1 + F=2 + C=2*F=0 + C=0*T=2 + F=1*V=0 <-> PB=1           1    0.952          8
 C=1 + F=2 + C=0*F=1 + C=2*T=1 + T=2*V=0 <-> PB=1           1    0.952          8
 ... (total no. of formulas: 14)

# same results again, simpler command
causalChain(d.pban, ordering = "C, F, T, V < PB", sol.cov = 0.95)

M01: C{1} + F{2} + C{0}*F{1} + C{2}*V{0} <=> PB{1}
M02: C{1} + F{2} + C{0}*T{2} + C{2}*V{0} <=> PB{1}
M03: C{1} + F{2} + C{0}*F{1} + C{2}*F{0} + F{1}*V{0} <=> PB{1}
M04: C{1} + F{2} + C{0}*F{1} + C{2}*T{1} + T{2}*V{0} <=> PB{1}
M05: C{1} + F{2} + C{0}*F{1} + T{1}*V{0} + T{2}*V{0} <=> PB{1}
M06: C{1} + F{2} + C{0}*T{2} + C{2}*F{0} + F{1}*V{0} <=> PB{1}
M07: C{1} + F{2} + C{0}*T{2} + C{2}*T{1} + T{2}*V{0} <=> PB{1}
M08: C{1} + F{2} + C{0}*T{2} + T{1}*V{0} + T{2}*V{0} <=> PB{1}
M09: C{1} + F{2} + C{0}*F{1} + C{2}*F{0} + F{1}*T{1} + T{2}*V{0} <=> PB{1}
M10: C{1} + F{2} + C{0}*F{1} + C{2}*T{1} + F{0}*T{2} + F{1}*V{0} <=> PB{1}
M11: C{1} + F{2} + C{0}*F{1} + F{0}*T{2} + F{1}*V{0} + T{1}*V{0} <=> PB{1}
M12: C{1} + F{2} + C{0}*T{2} + C{2}*F{0} + F{1}*T{1} + T{2}*V{0} <=> PB{1}
M13: C{1} + F{2} + C{0}*T{2} + C{2}*T{1} + F{0}*T{2} + F{1}*V{0} <=> PB{1}
M14: C{1} + F{2} + C{0}*T{2} + F{0}*T{2} + F{1}*V{0} + T{1}*V{0} <=> PB{1}

# specifying a lower consistency threshold for the solutions
mvcna(d.pban, ordering = list(c("C", "F", "T", "V"), "PB"),
      con = .93, maxstep = c(6, 6, 10), what = "a")

--- Coincidence Analysis (CNA) ---

Causal ordering:
C, F, T, V < PB

Atomic solution formulas:
-------------------------
Outcome PB=1:
 solution                                     consistency coverage complexity
 C=1 + F=2 + T=2 + C=2*T=1 <-> PB=1                 0.955        1          5
 C=1 + F=2 + T=2 + C=2*F=0 + F=1*T=1 <-> PB=1       0.955        1          7

# same thing with
causalChain(d.pban, ordering = "C, F, T, V < PB", pi.cons = 0.93, sol.cons = 0.95)

M1: C{1} + F{2} + T{2} + C{2}*T{1} <=> PB{1}
M2: C{1} + F{2} + T{2} + C{2}*F{0} + F{1}*T{1} <=> PB{1}

# setting consistency thresholds for the PIs, solutions and also
# a coverage threshold for the solution (note that yet another
# function for fuzzy sets is needed in package cna)
dat2 <- d.autonomy[15:30, c("AU", "RE", "CN", "DE")]
fscna(dat2, ordering = list("AU"), con = .9, con.msc = .85, cov = .85, what = "a")

--- Coincidence Analysis (CNA) ---

Causal ordering:
RE, CN, DE < AU

Atomic solution formulas:
-------------------------
Outcome AU:
 solution             consistency coverage complexity
 RE*cn + re*CN <-> AU        0.92    0.851          4
 re*DE + cn*DE <-> AU        0.90    0.862          4

# again, the same results using the same function:
causalChain(dat2, ordering = "AU", sol.cons = 0.9, pi.cons = 0.85, sol.cov = 0.85)

M1: ~RE*CN + RE*~CN <=> AU
M2: ~RE*DE + ~CN*DE <=> AU

Author

Adrian Dusa

See also

minimize(), truthTable()