clustering

snafu.clustering.clusterSize(fluency_lists, scheme, clustertype='fluid')

Calculate average cluster size of a fluency list (or list of fluency lists).

This function expects a list of lists. To calculate the average cluster size of a single list, wrap it in another list, e.g., [fluency_list].

Parameters:

fluency_lists (list) – A list of fluency lists, e.g., fluencydata.labeledlists
scheme (str or int) – For semantic fluency data, specify the path to the clustering scheme file (.csv) to use. For letter fluency data, specify an integer indicating the number of initial letters to use as clusters (e.g., 2).
clustertype (str, optional) – Type of clustering to apply. Default is ‘fluid’. The other option is ‘static’.

Returns:

A list containing the average cluster size in each fluency list.

Return type:

list of float
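
As a rough illustration of the quantity returned, the sketch below averages a set of cluster sizes for each fluency list. It does not call SNAFU: the helper name mean_cluster_sizes, the hard-coded cluster sizes, and the choice to report 0.0 for an empty list are all assumptions made for this example, not the library's implementation.

```python
from statistics import mean


def mean_cluster_sizes(cluster_sizes_per_list):
    """Hypothetical helper: average the cluster sizes of each fluency list."""
    # Assumption: a list with no clusters is reported as 0.0.
    return [float(mean(sizes)) if sizes else 0.0
            for sizes in cluster_sizes_per_list]


# Made-up cluster sizes for two fluency lists.
sizes = [[2, 2], [1, 2]]
averages = mean_cluster_sizes(sizes)
print(averages)
```

A single set of cluster sizes is handled the same way as a single fluency list in the function above: wrap it in another list, e.g., mean_cluster_sizes([[2, 2]]).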

snafu.clustering.clusterSwitch(fluency_lists, scheme, clustertype='fluid', switchrate=False)

Calculate the number of cluster switches in a fluency list (or list of fluency lists). Alternatively, calculate the switch rate (number of switches divided by list length).

This function expects a list of lists. To calculate the number of cluster switches in a single list, wrap it in another list, e.g., [fluency_list].

Parameters:

fluency_lists (list) – A list of fluency lists, e.g., fluencydata.labeledlists
scheme (str or int) – For semantic fluency data, specify the path to the clustering scheme file (.csv) to use. For letter fluency data, specify an integer indicating the number of initial letters to use as clusters (e.g., 2).
clustertype (str, optional) – Type of clustering to apply. Default is ‘fluid’. The other option is ‘static’.
switchrate (bool, optional) – If True, returns the switch rate instead of switch count. Default is False.

Returns:

A list containing the number of switches in each fluency list (or the switch rate of each list, if switchrate is True).

Return type:

list of float
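
The relationship between switch count and switch rate can be sketched as follows. This example does not call SNAFU. The helper count_switches is hypothetical, and it assumes that a switch occurs at every boundary between adjacent clusters and that the cluster sizes add up to the list length; neither assumption is confirmed by the documentation above.

```python
def count_switches(cluster_sizes, switchrate=False):
    """Hypothetical helper: switches implied by a list of cluster sizes."""
    if not cluster_sizes:
        return 0.0
    # Assumption: one switch at each boundary between adjacent clusters.
    switches = len(cluster_sizes) - 1
    if switchrate:
        # Switch rate = number of switches divided by list length.
        # Assumption: the cluster sizes sum to the list length.
        return switches / sum(cluster_sizes)
    return float(switches)


# Two clusters of size 2, e.g., a four-item list.
count = count_switches([2, 2])
rate = count_switches([2, 2], switchrate=True)
print(count, rate)
```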

snafu.clustering.findClusters(fluency_lists, scheme, clustertype='fluid')

Calculate the size of each cluster in a fluency list (or list of fluency lists) and return these cluster sizes as a list. For example, [‘dog’, ‘cat’, ‘whale’, ‘shark’] might return [2, 2], as there are two clusters of size 2.

This function is used internally by snafu.clusterSize and snafu.clusterSwitch.

Parameters:

fluency_lists (list) – A list of fluency lists, e.g., fluencydata.labeledlists
scheme (str or int) – For semantic fluency data, specify the path to the clustering scheme file (.csv) to use. For letter fluency data, specify an integer indicating the number of initial letters to use as clusters (e.g., 2).
clustertype (str, optional) – Type of clustering to apply. Default is ‘fluid’. The other option is ‘static’.

Returns:

A list of cluster sizes (or nested list of cluster sizes).

Return type:

list
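
To make the [2, 2] example concrete, here is a minimal sketch that derives cluster sizes from semicolon-separated category labels. It does not call SNAFU. The helper cluster_sizes_from_labels is hypothetical, and the rule it applies (an item joins the current cluster when it shares at least one category with the item immediately before it) is an assumption chosen for illustration; it is not claimed to match either the ‘fluid’ or the ‘static’ clustertype.

```python
def cluster_sizes_from_labels(labels):
    """Hypothetical helper: sizes of runs whose adjacent items share a category."""
    sizes = []
    previous = None
    for label in labels:
        categories = set(label.split(";"))
        if previous is not None and categories & previous:
            sizes[-1] += 1   # shares a category with the previous item
        else:
            sizes.append(1)  # no shared category: start a new cluster
        previous = categories
    return sizes


# Labels for ['dog', 'cat', 'whale', 'shark'] under a made-up scheme.
labels = ["canine;pets", "pets", "fish;water", "fish;water"]
sizes = cluster_sizes_from_labels(labels)
print(sizes)
```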

snafu.clustering.labelClusters(fluency_lists, scheme, labelIntrusions=False, targetLetter=None)

Replace each item in a fluency list (or list of fluency lists) with its category or categories. For example, [‘dog’, ‘cat’, ‘whale’, ‘shark’] might return [‘canine;pets’, ‘pets’, ‘fish;water’, ‘fish;water’].

This function is used internally by snafu.findClusters.

Parameters:

fluency_lists (list) – A list of fluency lists, e.g., fluencydata.labeledlists
scheme (str or int) – For semantic fluency data, specify the path to the clustering scheme file (.csv) to use. For letter fluency data, specify an integer indicating the number of initial letters to use as clusters (e.g., 2).
labelIntrusions (bool, optional) – When False, intrusions are silently omitted (as if they do not exist). When True, intrusions are replaced with the pseudo-category label ‘intrusion’. Default is False.
targetLetter (str, optional) – For letter fluency data, identifies the target letter. This is needed only to identify intrusions (when labelIntrusions is set to True); otherwise it has no effect. Default is None.

Returns:

A list (or nested list) of categories corresponding to each item.

Return type:

list
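
The effect of labelIntrusions described above can be sketched with a plain dictionary standing in for the clustering scheme. This example does not call SNAFU or read a .csv file: the helper label_items, the categories mapping, and the rule that any word missing from the mapping counts as an intrusion are all assumptions made for illustration.

```python
def label_items(fluency_list, categories, label_intrusions=False):
    """Hypothetical helper: replace each item with its category label."""
    labels = []
    for item in fluency_list:
        if item in categories:
            labels.append(categories[item])
        elif label_intrusions:
            # Mark the intrusion with the pseudo-category label.
            labels.append("intrusion")
        # Otherwise the intrusion is silently omitted.
    return labels


# Made-up scheme: each word maps to semicolon-separated categories.
categories = {
    "dog": "canine;pets",
    "cat": "pets",
    "whale": "fish;water",
    "shark": "fish;water",
}
words = ["dog", "cat", "table", "whale", "shark"]

omitted = label_items(words, categories)
marked = label_items(words, categories, label_intrusions=True)
print(omitted)
print(marked)
```

With intrusions omitted, the output is one item shorter than the input, so positions in the labeled list no longer line up with positions in the original list.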