- #1
fog37
- TL;DR Summary
- Clearly understand the difference between categorical and numerical variables...
Hello,
I am generally clear on the distinction between numerical and nonnumerical (also called categorical or qualitative) variables, but I still have a few doubts.
A numerical variable (continuous or discrete) has a value that derives from a measurement procedure (using a tool) or from counting, I would say.
Obviously, height and weight are continuous numerical variables (we use tools to get their values). The number of passengers on a plane is a discrete numerical variable even if we don't use a tool to determine that number. What we can do with numerical variables is math (calculate the mean, the mode, the median).
With regard to categorical variables, they are variables with a finite number of labels: each observation belongs to one of a finite number of groups (2 or more). The labels are generally text, but they can also be numbers (zip code, etc.) which don't really have a mathematical meaning, I would say.
Ordinal (nominal too?) categorical variables appear to be similar to discrete numerical variables because they have a finite number of values. What is the main difference then? What is the criterion to determine that? For categorical variables, we can calculate the frequency of a certain label/class, and the only measure of central tendency is the mode (we cannot compute the mean and median for a nominal or ordinal qualitative variable)...
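To make the "frequency and mode" part concrete, here is a minimal sketch using Python's standard `collections.Counter`. The color labels are made-up data, purely for illustration:

```python
from collections import Counter

# Hypothetical nominal data: the labels are names, not quantities
colors = ["red", "blue", "red", "green", "red", "blue"]

# Frequency of each label/class
counts = Counter(colors)

# The mode is the label with the highest frequency
mode_label, mode_freq = counts.most_common(1)[0]

print(counts)
print(mode_label, mode_freq)
```

Nothing here adds or averages the labels; it only counts how often each one occurs.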
Here is my dilemma: star rating (1 star to 5 stars) is generally considered an ordinal categorical variable. However, we could take all the ratings provided by 4 customers (4, 3, 4, 1) and compute the average star rating: (4+3+4+1)/4 = 3.
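The arithmetic for those four ratings can be sketched with Python's standard `statistics` module. The variable names are just for illustration, and treating the star labels as plain integers is exactly the assumption I am unsure about:

```python
from statistics import mean, median, mode

# The four customer ratings from the example, treated as integers
ratings = [4, 3, 4, 1]

average_rating = mean(ratings)    # (4 + 3 + 4 + 1) / 4
middle_rating = median(ratings)   # midpoint of the sorted values 1, 3, 4, 4
most_common = mode(ratings)       # most frequent rating

print(average_rating, middle_rating, most_common)
```

The code runs without complaint, so the software itself does not stop me from averaging the labels; whether the result is meaningful is the question.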
So can a variable be categorical and also numerical at the same time? I wouldn't think so. "Star rating" is categorical, but "average star rating" is a different variable and is numerical. Is that correct?
Thanks for any clarification.