Hello All
I am running a linear regression model in which a continuous covariate (CSF biomarker data) is missing for some subjects. I have coded the missing values as NaN (e.g. 3 1 5 NaN 6 NaN). So far I have run the analysis as a linear regression with AllSubjects (that is, AllSubjects and CSF_bio, [0 1]). What difference does it make to include AllSubjects, or should I include only the subjects for whom I have CSF data?
For example, should I create a new variable (Usable_Scans) that flags only the scans with available covariate data (CSF biomarker data)? In other words, my covariates would look like this:
Usable_Scans: 1 1 1 NaN 1 NaN
CSF_Bio: 3 1 5 NaN 6 NaN
and then I would run the analysis as (Usable_Scans, CSF_Bio, [0 1]).
What differences, if any, are there in how the statistics are handled between these two setups? Thank you in advance for your help.
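To make the second option concrete, here is a minimal sketch of what I mean by restricting the fit to subjects with CSF data (a complete-case fit). It uses plain NumPy least squares, not my actual analysis software, so it says nothing about how that software treats NaN internally. The outcome values and the function name `complete_case_fit` are made up for illustration; only the CSF_Bio values come from my example above.

```python
import numpy as np

# CSF_Bio values from the example above; NaN marks missing data.
csf_bio = np.array([3.0, 1.0, 5.0, np.nan, 6.0, np.nan])
# Hypothetical outcome values, one per subject (illustrative only).
outcome = np.array([2.1, 1.2, 3.9, 2.5, 4.8, 3.0])

def complete_case_fit(y, x):
    """Fit y = b0 + b1 * x by ordinary least squares,
    using only the rows where x is not NaN."""
    usable = ~np.isnan(x)  # plays the role of Usable_Scans
    # Design matrix: a column of ones (intercept) and the covariate.
    design = np.column_stack([np.ones(usable.sum()), x[usable]])
    beta, *_ = np.linalg.lstsq(design, y[usable], rcond=None)
    return beta, int(usable.sum())

beta, n_used = complete_case_fit(outcome, csf_bio)
print("subjects used:", n_used)
print("intercept, slope:", beta)
```

In this sketch the two subjects without CSF data contribute nothing to either the intercept or the slope, which is the behavior I would expect from the Usable_Scans setup.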