Research Methods Supercourse (mirror of BA Super Help Desk ("methods door"))
Introduction Questions and Answers Examples
The BA Super Help Desk is designed to answer research design and statistical questions to help you publish your research. If you have a question, you can submit it to the "BA Supercourse Help Desk" at email@example.com
We provide background materials for you to review to help answer your questions. BA Super Help Desk Introductory Lectures: The Help Desk provides lectures on 12 of the major areas in which questions are expected. Before you contact the Help Desk, consider reviewing these lectures. In addition, if you are trying to decide which statistical test to use, we would appreciate it if you reviewed the web page called Which Test: http://whichtest.info/index.html.
If you would like additional knowledge you can visit the Main Help Desk Web Pages
The mission of the Help Desk is to answer specific research design questions to help you publish. Our role is not to teach statistics in general or to help with English translation, but only to help with research methods. We want to help you bring your research design and statistics up to a level of publishable quality. This is a free service. Please tell researchers who are having problems in publishing about our efforts.
Ask a question - "BA Supercourse Help Desk"
We are piloting the Help Desk in the Eastern Mediterranean countries before extending it to all countries with limited publications.
Answer a question - "BA Supercourse Help Desk" Please include the particular question in your e-mail together with your answer to it. Thank you for your answer! The direct e-mail address for questions and answers is firstname.lastname@example.org
|Q1 by Nabil D Sulaiman, May 19, 2013||What is the best sampling frame for a national diabetes prevalence study in the absence of updated GHS sample?||
A1 by Mohamed E. Salem, May 25, 2013
In the absence of a General Household Survey (GHS) sample, the most accurate and easiest alternative is to use the readily available geographical information for the country to create your own nationally representative sample. The country map, including its geographical information, will be used as the sampling frame. Geographical Information System (GIS) applications will help you to divide your map into representative clusters of households. ArcGIS is one of the applications that could be used for this exercise.
Here is a youtube link for ArcGIS tutorial
Here is an alternative youtube link (in Arabic); the sound is not very good
The program will enable you to use your country map as a sampling frame. Using the geographical information available on the map, you can select the clusters of households. Within each cluster you can draw a systematic random sample using a walk-through method inside the selected study areas.
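The systematic random selection within a cluster described above can be sketched as follows. This is a minimal illustration in Python, not part of the original answer: the function name systematic_sample, the numbered household list, and the seed are all hypothetical, and it assumes you already have a list of the households in the selected cluster.

```python
import random


def systematic_sample(households, n, rng=None):
    """Pick n households at a fixed interval after a random start."""
    rng = rng or random.Random()
    total = len(households)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of households")
    step = total / n                    # sampling interval k = N / n
    start = rng.random() * step         # random start inside the first interval
    # Clamp the index so rounding can never step past the end of the list.
    return [households[min(int(start + i * step), total - 1)] for i in range(n)]


# Hypothetical cluster of 100 numbered households; select every 10th one.
cluster = list(range(1, 101))
selected = systematic_sample(cluster, 10, random.Random(1))
print(selected)
```

The random start is what keeps the selection unbiased; the fixed interval then spreads the sample across the whole cluster.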
A1 by Nabil D Sulaiman, May 26, 2013
Various sampling approaches could be explored to find the best possible sampling frame for a rapidly changing expatriate population in the Gulf region: the National Census (GHS), the Water and Electricity Register, the telephone register, and the National ID.
Based on representativeness and feasibility, we in the UAE have adopted a novel sampling methodology, which involved systematic random sampling through Preventive Medicine Departments (PMDs), which all expatriate adults in the UAE are legally required to attend every 2-3 years to renew their residency visa. The PMD is a single place where recruiters, interviewers, nurses and phlebotomists are available. All staff working on the study were nominated and trained for it. Blood samples for both the study and the visa renewal were collected at the same visit.
|Q2 by Abdelrahim Mutwakel Gaffar, May 19, 2013||I am conducting an evaluation study for a project using a "pre-intervention - post-intervention" design. What is the best statistical test to examine the change due to the intervention?||
A2 by Sami AR AL-Dubai, June 6, 2013
You can use SPSS. The statistical test depends on the type of dependent variable you want to test. If the dependent variable is normally distributed, you can use the ''paired sample T-Test''. If it is not normally distributed, you can use the alternative non-parametric test for ''2 related samples'' (Wilcoxon Signed Ranks).
A2 by Shacara Johnson, June 5, 2013
This research would be considered a repeated measures or paired study design, in which you are interested in observing the change due to an intervention in the same group of subjects. The statistical test to use is called a paired t-test. To determine whether a difference exists, you find the mean and standard deviation of the differences between the before and after measurements, and then use the t-distribution for a single mean to analyze the difference (at the chosen significance level).
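The paired calculation described above (mean and standard deviation of the before/after differences, then a t statistic for a single mean) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name paired_t and the blood pressure values are hypothetical, and the sketch returns the t statistic and degrees of freedom without computing a p value.

```python
import math
import statistics


def paired_t(before, after):
    """Return the paired t statistic and its degrees of freedom."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [a - b for b, a in zip(before, after)]   # after minus before
    n = len(diffs)
    mean_d = statistics.mean(diffs)                  # mean of the differences
    sd_d = statistics.stdev(diffs)                   # sample SD of the differences
    t = mean_d / (sd_d / math.sqrt(n))               # t statistic for a single mean
    return t, n - 1


# Hypothetical systolic blood pressure for 5 subjects, before and after.
before = [140, 152, 138, 147, 160]
after = [135, 150, 130, 145, 150]
t_stat, df = paired_t(before, after)
print(t_stat, df)
```

The absolute value of the statistic is then compared with the critical value of the t-distribution for the given degrees of freedom, or passed to a statistical package to obtain the p value.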
A2 by Mohammad Babaeeian moghaddam, June 5, 2013
If we have a repeated measures design with pre-post measures and the variables are normally distributed (as assessed in SPSS), we can use a paired t-test analysis. If the data are not normally distributed, we can use the non-parametric Wilcoxon test. These two tests can be found in all standard statistical packages (SPSS, SAS and others).
A2 by Nicolas Padilla, June 6, 2013
If the variable is quantitative, compute the differences (measure 1 - measure 2), then the mean of the differences, and then apply Student's t (small sample size, less than 50) or Z (sample size more than 50).
A2 by Rami H. AL Rifai, June 9, 2013
The idea is: "Tell me what type of dependent variable you have, and I will tell you what type of test you could use."
There are 2 main types of variables:
1- Continuous: like measuring blood pressure.
2- Discrete: like Yes or No variables.
Discrete variables could be dichotomous (two subcategories), trichotomous (three subcategories), or have more than three subcategories.
However, in pre-post intervention studies, the tests to use when the intervention was carried out on the same group differ from those to use when it was carried out on different groups, such as control and intervention groups.
For example, if your dependent variable is continuous and your intervention was on the same group, you have to use the paired t-test to detect whether there was a difference due to the intervention. But before that you have to test the assumption of normality for the dependent variable; if it does not hold, you have to use the non-parametric test.
A2 by Jay M. Fleisher, June 10, 2013
If your data is continuous and Normally distributed you can use a paired t-test procedure. This would apply if you can create an index over all
If your data is categorical you can use McNemar's test. This would be applicable if you are looking at individual questions.
Remember your data are paired.
A2 by Mohamed E. Salem, June 13, 2013
If you want to measure the impact of an intervention, the simplest tests are:
1) The paired sample T-test, if your indicator is measured quantitatively (blood sugar level, BMI, etc.).
See this link on how to perform paired sample T-test using SPSS http://www.youtube.com/watch?v=MJGk2sg4EZU
2) The McNemar test, if your indicator is measured qualitatively (diseased or not, complicated or not, etc.).
See this link on how to use McNemar
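For the qualitative case, the McNemar test uses only the discordant pairs, that is, the subjects whose status changed between the two measurements. The following Python sketch illustrates the calculation. It is an assumption-laden example rather than the method shown in the linked material: the function name mcnemar and the counts are hypothetical, and it uses the chi-square form with 1 degree of freedom and no continuity correction.

```python
import math


def mcnemar(b, c):
    """McNemar chi-square (1 df, no continuity correction) and its p value.

    b = pairs positive before and negative after
    c = pairs negative before and positive after
    """
    if b + c == 0:
        raise ValueError("there are no discordant pairs to test")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical result: 15 subjects improved, 5 worsened.
chi2, p_value = mcnemar(15, 5)
print(chi2, round(p_value, 4))
```

With few discordant pairs, an exact (binomial) version of the test is usually preferred; statistical packages offer both.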
My suggestion to you is to improve your study design. A pre-post intervention design is a weak design if you want to attribute the change (improvement) to your intervention. Including a control group will give more strength to your results. Random assignment of the participants to the intervention and control groups will make your study even stronger, and blinding, where applicable, would be ideal.
I am attaching a link to study designs
If you use one of the above designs, the analysis should be more in-depth, using double difference analysis, regression models, and, for quasi-experimental designs, non-equivalent group design analysis.
Q3 by Saad Tai, June 2, 2013
|I am interested in conducting a KAP study on HIV among medical doctors in Pakistan. Where can I get a questionnaire, or how can I make my own questionnaire?||
A3 by Deena Alasfoor, June 7, 2013
A guideline on how to do a KAP study is at the following link:
The research questions depend on the context and how you want to use the information. In general, it is important that each knowledge question is followed up with an attitude question and a practice question that help you decide on the course of action or intervention. As a researcher, you need to identify your questions based on the context and use the KAP method to explore them.
I hope this is helpful.
A3 by Bruce G. Weniger, June 8, 2013
Search the medical literature for well-written, high-quality reports in competitive journals for studies with similar research questions and methods you wish to employ. Many journals are already making available questionnaires, protocols, and other study-related documents by optional online download of “supplementary” material for published “printed” reports. For example, the supplementary material to this study (http://dx.doi.org/10.1056/
If the questionnaire is not posted in this way, then you can email or write to the paper’s author(s), explaining that you would like to perform a similar study in your own population, using a similar questionnaire for comparability. Ask if the author(s) will provide you with the questionnaire to adapt for your own study. Offer to acknowledge their assistance in your future paper, and to cite their work if relevant to what you eventually perform and find.
A3 by Shacara Johnson, June 8, 2013
You can access information on KAP (Knowledge, Attitudes, and Practice) survey instruments using the World Health Organization’s website or conduct an internet search for HIV KAP surveys in Pakistan. You might also want to search publications by fellow researchers who conducted similar research among HIV care providers in Pakistan and contact them about collaborating or asking for permission to utilize their instrument for your work.
There are several references to HIV KAP instruments pertaining to the Eastern Mediterranean region (which would include Pakistan), on topics ranging from conducting behavioral surveillance of risk factors to country-level results of KAP implementation. If you cannot identify a current instrument being used in Pakistan, then you might search for other KAP instruments used in a similar setting that you can modify for what you wish to examine among health care providers in Pakistan. Two sources as examples from the WHO site that may be of interest to provide points of consideration while constructing your instrument
|Q4 by Murtada Osman, June 11, 2013||How to select the journal for publication?||
A4 by Deena Alasfoor, June 14, 2013
Selecting a suitable journal for publication is one of the most difficult tasks for researchers. The impact factor, the interests of the journal, the value of your manuscript, and your experience in publishing all count. Obviously, you would want to aim at the journals with the highest impact factors. However, it is very hard to publish in these unless your manuscript is of great importance and you have collaborators who have published in that area before. First, select the journals that might interest you; these will probably be the ones you refer to in your own work. If your publication is context free, you might have a better chance of publishing in an international journal with a high impact factor. If your manuscript presents, for example, national survey results, you may want to go to a national or regional journal. Read the authors' guide carefully and be sure that your paper matches the journal's subjects of interest. If several journals qualify, you could try the one with the highest impact factor first and then, if rejected, go to the next lower one until your paper is accepted. Acceptance could happen the first time, but it could also take several attempts before a journal accepts your publication. Sometimes the topic has been discussed enough, and the editors do not see that your publication adds anything new to existing knowledge. Do not get disappointed; keep trying. Good luck.
A4 by Eugene Shubnikov, June 11, 2013
I recommend that you study the Supercourse lecture http://www.pitt.edu/~
Hanan Abdulghafur Khalil (through Facebook), June 23, 2013
May I ask about the required sample size for a pilot study and pretesting, and whether the results should be mentioned briefly after the study is completed?
A5 by Eman Eltahlawy (through Facebook), June 23, 2013
Thanks, Dr Hanan, for your question. It depends on your needs: to test the language of the questionnaire, the methodology and logistics in the field, and the time needed for the questionnaire.
A5 by Eugene Shubnikov, June 23, 2013
Dear Hanan, I recommend that you study the lectures from the Introductory page http://ssc.bibalex.org/helpdesk/introduction.jsf, especially the "Sample size and Statistical power Lecture". Thank you for your question!
A5 by Fatma Hassan, June 23, 2013 (Facebook)
Baker (1994) found that a sample size of 10-20% of the sample size for the actual study is a reasonable number of participants to consider enrolling in a pilot study. Another rule of thumb is to take 30 patients or more to estimate a parameter (Browne, 1999). Yes, the results of the pilot should be reported, preferably in the methodology section. The details of any modifications to the questionnaire based on the pilot should also be reported.
A5 by Nicolas Padilla, June 23, 2013 (Facebook)
In a pilot study you need about 10-20% of the sample size needed for the larger study.
A5 by Jay Fleisher, June 23, 2013 (Facebook)
Sample size calculations basically deal with the difference you expect to see and the significance level (alpha) at which you wish to detect it. There are many free sample size calculators on the Web.
Andrey Kuznetzov (through Facebook), July 08, 2013
|Is there any standard R function for calculating the variance of a probability distribution (not the sample variance)? Thanks in advance.||
A6 by Jay Fleisher, July 21, 2013
I think the question pertains to the software package R. There are many distributions besides the Normal distribution. I think the question is how to find the variance, using R, of a certain distribution.
I attach a brief description of what I think this question means.
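The attachment is not reproduced here. As a hedged illustration of the underlying idea only, the variance of a discrete probability distribution is the probability-weighted mean squared deviation from the distribution's mean, Var(X) = sum of p_i * (x_i - mu)^2. The sketch below is written in Python rather than R, and the function name distribution_variance and the fair-die example are hypothetical.

```python
def distribution_variance(values, probs):
    """Variance of a discrete distribution given its values and probabilities."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in zip(values, probs))            # E[X]
    return sum(p * (x - mean) ** 2 for x, p in zip(values, probs))


# Hypothetical example: a fair six-sided die, whose variance is 35/12.
die_values = [1, 2, 3, 4, 5, 6]
die_probs = [1 / 6] * 6
print(distribution_variance(die_values, die_probs))
```

Note that this divides by nothing: unlike the sample variance, there is no n - 1 denominator, because the probabilities already carry the weights.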
|Q7 by Mohammad Babaeeian moghaddam, July 19, 2013||Animal bites are an important problem in my city and I want to investigate that. How can I design the study (what study design can I use)? Are any questionnaires available for such studies?||
A7 by Nicolas Padilla, July 20, 2013 (Facebook)
First, do you mean animals such as dogs, for example? If you want to know the burden of animal bites, it is better to use a cross-sectional design; if you want to know the risk factors for animal bites, it is better to use a case-control design.
|Q8 by Shatabdi Goon, August 28, 2013||If I lost the survey data that I used in SPSS, how can I redo the statistical analysis without those data (I want to evaluate the p value)? Is it possible to get the p value directly from the results?||
A8 by Nicolas Padilla, August 29, 2013
You can use EPIDAT (free statistical software from Xunta de Galicia and PAHO) with tabulated data, for example.
A8 by Eman Eltahlawy, September 4, 2013
You can use EpiCalc 2000 with tabulated data to evaluate the p value; it is an easy program and is free on the net.
A8 by Mohamed E. Salem, September 5, 2013
You can use SPSS and organise your data for a 2*2 table as 0 1, 1 1, 0 0, 1 0, and put the count for each category in a third column. Then go to Data > Weight Cases and weight your data by the count column.
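The same idea, getting a p value from the four cell counts of a 2*2 table when the raw data are gone, can be sketched in Python. This is an illustration under assumptions, not the SPSS procedure itself: the function name chi2_2x2 and the counts are hypothetical, and it computes the Pearson chi-square with 1 degree of freedom and no continuity correction.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square and p value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical tabulated counts recovered from a printed result.
chi2, p_value = chi2_2x2(20, 10, 10, 20)
print(round(chi2, 3), round(p_value, 4))
```

When any expected cell count is small, Fisher's exact test is generally the safer choice.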
|Q9 by Nagah Selim, September 23,2013||
Javier Muñiz, September 23, 2013
1.- Your study aims at estimating a proportion in the population.
2.- You have to consider:
a.- How do I select the participants?: Sampling procedure
b.- How many participants should I select: Sample size (related to “a” to some extent).
3.- Assuming simple random sampling in “a”, the sample size depends on:
a.- Size of the population to which you want to infer the proportion that you will find in your study (sample). Surprisingly, it is not very important (except for very small populations).
b.- Any idea of what you expect to find? (48%, 38% and 47% in your case). 50% is the most demanding assumption (the one that will result in a bigger sample size). Use 50% and you will be safe (your pre-study estimate is very close to 50%).
c.- What precision do I need? Or, how wide do I want the confidence interval of my estimate? A wide confidence interval is less precise than a narrow one. The narrower the confidence interval desired when presenting the results (better precision), the bigger the sample size.
Below find the output of a program that I use (EPIDAT, developed by Xunta de Galicia, Spain, and PAHO).
Sample size and precision to estimate a population proportion
Size of the population: 20000
Expected proportion: 50,0%
Confidence level: 95,0%
Study design: 1,0
Precision (%) Sample size
What does this mean? For example, if you choose to aim at a precision of 2% (IT IS YOUR DECISION AS INVESTIGATOR) for the whole study, you will aim to include 2144 participants. At the end of the study you will be able to say: The prevalence of depression among school children is 50%, with a 95% confidence interval of 48-52% (maybe it is not exactly 50%, but you will be pretty sure that the proportion in the population is somewhere between 48% and 52%). When considering subgroups, your precision will decrease because of smaller sample sizes available (for example, you may have around 1000 boys and 1000 girls and the corresponding precision in these subgroups will be around 3%).
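The figure of 2144 participants can be reproduced with a common sample size formula for estimating a proportion with a finite population correction, n = N z^2 p(1-p) / (d^2 (N-1) + z^2 p(1-p)). The Python sketch below applies it to the inputs quoted above (population 20000, expected proportion 50%, 95% confidence, precision 2%). Whether EPIDAT uses exactly this formula internally is an assumption, and the function name sample_size_proportion is hypothetical.

```python
import math


def sample_size_proportion(population, p, precision, z=1.96):
    """Sample size to estimate a proportion under simple random sampling.

    population: size of the target population (N)
    p:          expected proportion (0.5 is the most demanding assumption)
    precision:  half-width of the confidence interval (d), e.g. 0.02 for 2%
    z:          normal quantile for the confidence level (1.96 for 95%)
    """
    pq = p * (1 - p)
    n = population * z ** 2 * pq / (precision ** 2 * (population - 1) + z ** 2 * pq)
    return math.ceil(n)          # round up to a whole participant


n_needed = sample_size_proportion(20000, 0.50, 0.02)
print(n_needed)
```

Trying a wider precision, such as 0.05, shows how quickly the required sample shrinks as the confidence interval is allowed to widen.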
NOTE: We have assumed simple random sampling (not always feasible when studying kids in schools). If another design is chosen, this may affect the sample size (bigger samples will be needed, at least in theory).
There are plenty of free programs available to compute the sample size for different study designs. I recommend EPIDAT 3.1 because it also has some other very useful procedures for tabulated data (http://www.sergas.es/
A9 by Abu Zar, September 23, 2013
|Q10 by Nagah Selim, September 23, 2013||If I would like to study the prevalence of a disease among clients attending phcc and I have a rough estimate of the monthly attendees, can I use this number as the total population for calculating the sample size?||
Jay Fleisher, September
If we assume alpha=0.05 and a margin of error = 5% the following sample sizes are
For a prevalence of 48%, n=384
This is for an un-stratified analysis because we don't have info for a difference for males vs females... One should add in about 20% for non-responders, if applicable
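The figure n=384 for a prevalence of 48% follows from the usual large-population formula n = z^2 p(1-p) / d^2 with z = 1.96 and a margin of error d = 5%. The Python sketch below reproduces it and then adds the 20% allowance for non-responders mentioned above. The function names are hypothetical, and treating "add in about 20%" as multiplying by 1.2 is an assumption about what was meant.

```python
import math


def sample_size_large_population(p, margin, z=1.96):
    """Sample size to estimate a proportion, ignoring population size."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def add_nonresponse(n, extra=0.20):
    """Inflate a sample size by a fraction to allow for non-responders."""
    return math.ceil(n * (1 + extra))


n_base = sample_size_large_population(0.48, 0.05)
n_final = add_nonresponse(n_base)
print(n_base, n_final)
```

Changing p, margin, or z in the call shows how each assumption moves the estimate.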
I have added a link (see attachment) that explains how to do it and calculates the sample size for you. One can vary alpha and the size of the margin of error to get different estimates...
As for the second question, if I understand it, the answer is no: one can't assume it is the whole population. What you have is a sample of those who go to the pcc. Thus the inference will be to the clinic.
See a link:
|Q11 by Naresh Chauhan September 29,2013||What is the sampling technique to draw samples from urban slum to know their behaviour on particular health problems and service utilization?||
Jay Fleisher, October
The following steps should be taken to ensure unbiased sampling:
1. Define the Population you want to sample. The inference of any analysis will go to this population.
2. Define your basic Measure of Effect. Are you going to sample Homes, Individuals, etc.?
3. When steps 1 and 2 are completed, RANDOMLY sample.
4. Conduct Analysis
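Step 3 above can be sketched as a simple random sample drawn from a list of sampling units. This Python example is illustrative only: the function name simple_random_sample, the household identifiers, and the seed are hypothetical, and it assumes a complete list of units (the sampling frame) already exists for the defined population.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    rng = random.Random(seed)       # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)     # sampling without replacement


# Hypothetical frame: 500 numbered households in the defined population.
frame = [f"household-{i}" for i in range(1, 501)]
chosen = simple_random_sample(frame, 25, seed=2013)
print(chosen[:5])
```

Recording the seed alongside the frame lets someone else reproduce exactly which units were selected.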
|Q12 by Mary Mwangome, October 02, 2013||I am planning to analyse a cohort database for the effect of half dose of drug x prophylaxis on deaths. Drug x is a prophylactic medication.
Jay Fleisher, October
There are sampling issues in this study design but I don’t think they're fatal. If I understand the design, you have a situation where the patients act as their own controls with respect to dosage. Thus you have paired data. I would break the pairing and run separate analyses on each dose. I would try logistic regression on each dose. This would control for any covariates you have.
In other words, you can use mortality as your outcome variable and dose 1 plus the covariates for dose 1 as your independent variables, and do the same analysis separately for dose 2. Then compare the odds ratios for each dose along with the 95% Confidence Intervals and p values that the logistic regression will provide to you. If the Confidence Intervals overlap then Dose would have no effect. As for the 30% lost to follow-up, I think a 70% follow-up rate is acceptable. STATA and SAS can do this easily. You would have to report the weaknesses in design, of course. The “wash out” period is of concern since there was none. I would give it a try anyway. My opinion is that if your results show a clinically significant difference with the higher dosage you have an answer. If they do not then you have another answer.
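To make the quantities being compared concrete, the Python sketch below computes a crude (unadjusted) odds ratio with a 95% confidence interval from a 2*2 table, using the standard error of the log odds ratio (Woolf's method). This is deliberately simpler than the logistic regression suggested above and does not adjust for covariates; the function name odds_ratio_ci and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and confidence interval for the table [[a, b], [c, d]].

    a = exposed who died,   b = exposed who survived
    c = unexposed who died, d = unexposed who survived
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 10 deaths among 100 on the dose, 20 among 100 not on it.
or_est, lower, upper = odds_ratio_ci(10, 90, 20, 80)
print(round(or_est, 3), round(lower, 3), round(upper, 3))
```

A logistic regression in STATA or SAS returns the same kind of estimate and interval for each dose, adjusted for the covariates in the model.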
Faina Linkov, October
Main points for the answer:
1. Loss to follow up of 30% in 6 years is very good and typical for studies of this caliber.
2. Survival analysis might provide good approach for some of the data analysis.
3. Stata and SAS can both do the analysis.
|Q13 by Zafar Fatmi, December 05, 2013||I am trying to analyze time-series data for air pollution and cardiovascular diseases. I want to use a Generalized Additive Model (GAM) for the analysis. I am unable to find any help in this
I have to adjust for weather variables and age and gender.
Please provide some guideline and help.
Mohammad Babaeeian moghaddam, December 05, 2013
See Question 7 and Answer 7 first.
|Animal bite (dog bite or pet - home dog) is an important problem in my city and I want to investigate the causes (factors that make an animal angry so that it attacks its owner). How can I design the study (what study design can I use)? Is a questionnaire available for this study?|