Elements of Credibility in Educational Research

Identifying elements of credibility to evaluate assessment is relatively straightforward. Assessment, whether formal or informal, is part of a teacher's daily professional practice, and so an experienced teacher can approach this task confident of making an accurate judgment–one that is valid, fair, and reliable–about learners' progress.

Research is a very different matter. The criteria for judging the credibility of a teacher's research study are considerably more complex; what is more, research is not a daily activity of teachers, and for many in the profession it may be altogether foreign, making any attempt to conduct a study that can be judged credible a formidable task. Nevertheless, doing research in their classrooms can be of great value to teachers, providing opportunities for them to increase their knowledge of learning in their particular teaching specialty, to develop creativity, or to find solutions to issues affecting learning in their classes–and, through each of these means, to strengthen their professional development. Additionally, by sharing their findings through articles, webinars, and/or sessions at professional conferences, teachers can serve the professional-development needs of their colleagues.

In a study that examined the issue of credibility from the perspective of researchers engaged in the scholarship of teaching and learning (SOTL), Biliot et al. (2017) found that SOTL researchers considered the quality of their research to be the most important aspect of their credibility as professionals. These researchers also considered getting their work published, conducting workshops, and making presentations at conferences to be of high importance as well. Hanuka, in a review of SOTL research, argues that basing research on a strong theoretical framework (i.e., on theories relevant to the topic on which a researcher is focusing) and using proper methodology are critical aspects of credibility for teachers embarking on a study.

It is therefore important for teachers to understand how to conduct research in a manner that can be judged credible not only by local stakeholders but also by other professionals in our field. The purpose of this article is to present quality criteria for teachers to keep in mind as they plan, conduct, and analyze their classroom research so that the results they report will be well received in the professional community.

Criterion #1: selecting the right research approach

The first question for any researcher to ask is “What do I want to do?” This question leads to selecting the proper research approach for the planned study. Unfortunately, educational program developers and textbook writers have traditionally focused on research designs rather than approaches. Research designs, however, deal with the “how” of doing research, and this needs to be preceded by the “what.” If a researcher does not clearly understand what he or she wants to do, then deciding how to do it is not at all helpful.

The nature of much learning–and language learning in particular–is such that studying it in a controlled environment such as a laboratory is not practicable. While experimental studies are often carried out in educational research, they are not the only option and very often do not offer the best means of helping an individual researcher or team achieve their stated purpose for conducting a study: “I/We want to . . .”.

Here are four common purposes for choosing to embark on a research study in education:

  • Study a program or a group of learners as is.
  • Study statistical relationships among factors affecting learning.
  • Study the effects of a technique or educational material on learning.
  • Explore a possible solution to an issue affecting learning in my classroom.

Each of the above aims is best achieved by a particular research approach:

  • Studying a program or a group of learners as is calls for a naturalistic research approach.
  • Studying statistical relationships among factors affecting learning requires a multivariate approach.
  • Studying the effects of an instructional technique or educational material on learning requires an experimental approach.
  • Exploring a possible solution to an issue affecting learning in your classroom calls for an action research approach.

The beauty of a naturalistic approach is that it offers a researcher an opportunity to examine a learner or group of learners as is in the context in which the learning takes place. It is largely descriptive, making use of repeated observations, as well as such data collection techniques as interviews in order to understand how learning is taking place from multiple perspectives, that is, the views of those who are part of that context, such as students and administrators. It involves a long, inductive process, requiring collection and analysis of a large amount of verbal data in order to arrive at a clear understanding of how learning takes place, be it in the formal setting of a classroom or less formal setting of a coffee shop, dormitory, or student club.

A multivariate approach is useful for providing insights about relationships among two or more factors related to learning. A large-scale study of factors affecting academic achievement of ESL learners in one or more school districts is one example of a study using such an approach. While such a study would be descriptive, like a naturalistic approach, the description of relationships would be primarily or wholly statistical rather than verbal and would not likely make use of multiple perspectives in order to understand what is going on.

An experimental approach is in order when one or more teachers wish to measure the effectiveness of a particular teaching method/technique, assessment tool, or teaching material. The experiment may involve one or multiple classes and will be deductive in nature, seeking to test one or more statistical hypotheses, to be supported or rejected depending on the results of the data analysis/es conducted.

Finally, an action research approach is used for dealing with a significant issue affecting learning in a particular classroom. Action research may be participatory or practical (Creswell, 2018). While participatory action research often has a social justice aim, practical action research, the type described here, is pedagogical in its purpose, seeking to find a solution to resolve a learning issue.

Criterion #2: selecting the right research design

Once an individual researcher or team has determined what type of approach to use for a study, the next step is to decide how to carry out the study. This is where the matter of design comes in, and the design is dependent on the approach. Among designs within the naturalistic approach, ethnographies and case studies are the best known. To apply an artistic metaphor, an ethnographer seeks to paint as complete a portrayal of a particular learning context as possible. Thus, an ethnographer focusing, let’s say, on the experience of acquiring both languages in a bilingual program will spend an extensive amount of time observing classes, interviewing various stakeholders, administering questionnaires, and collecting and analyzing documents in order to gain as complete an understanding as possible of the bilingual learning experience. A case study researcher, on the other hand, is like an artist who focuses on one particularly noticeable aspect of the learning context, such as a single class or learner, in order to gain a more detailed understanding of it.

Designs applying a multivariate approach focus on two or more learning-related factors. In addition to the school-district studies described earlier, a multivariate study may involve collecting data on factors affecting freshman undergraduate GPA, for example, to understand the relative strength of the relationship between each independent variable (e.g., SAT score) and the dependent variable (GPA). Alternatively, learners’ performance on tests related to reading comprehension can be compared with their scores on a direct test of reading, such as the iBT TOEFL reading section, to measure the extent to which factors such as vocabulary size and reading speed are related to overall reading ability. Large-scale survey studies are another design type applying a multivariate approach. Such surveys focus on topics such as confidence, willingness to communicate, or language aptitude to understand relationships among learning-related variables.
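As a simple illustration of the statistic underlying this kind of multivariate question, the sketch below computes a Pearson correlation coefficient between two paired sets of scores. The data and variable names are invented for illustration only; a real study would use validated measures and a much larger sample.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired score lists."""
    mx, my = mean(xs), mean(ys)
    # Sample covariance divided by the product of the sample standard deviations
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# Hypothetical data: vocabulary-size scores paired with overall reading scores
vocab = [42, 55, 61, 48, 70, 66, 53]
reading = [18, 22, 25, 20, 28, 27, 21]
print(round(pearson_r(vocab, reading), 3))  # → 0.993
```

A coefficient near 1 would suggest a strong positive relationship between the two factors; interpreting its practical meaning still requires the kind of careful design discussed above.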

Unlike studies conducted using a naturalistic or multivariate approach, experimental studies are highly controlled. As in the hard sciences, these studies involve some form of treatment and require a design that can account for all the variables (factors) that can affect the outcome of the study. Thus there are a number of experimental designs to select from, depending on how many variables one is dealing with. Selection of the right design is important not only for including the relevant variables but also for selecting the correct data-analytic procedure. For teacher researchers, the most important types of experimental designs to understand are pure experiments, in which all subjects (participants) are randomly selected from a particular population and then assigned to a treatment group (experimental or control); quasi-experimental studies, in which random selection of subjects is not possible (generally the case in school-based studies); and one-group experiments, in which a single group (generally, students enrolled in the same section of a course) serves as the experimental group. For educational studies, measuring the treatment very often involves a post-test, a pre-test and post-test, or a post-treatment (or pre- and post-treatment) questionnaire. Experimental studies focusing on skills generally involve tests or tasks, while those focusing on motivation or self-efficacy (confidence) make use of questionnaires rather than, or in addition to, tests or tasks.
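For a one-group pre/post-test design of the kind described above, a standard deductive analysis is a paired-samples t test on the difference scores. The sketch below computes the t statistic by hand with hypothetical scores; a real study would also check the test's assumptions and compare the result against a t distribution for significance.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """t statistic for a one-group pre/post-test design (paired samples)."""
    diffs = [b - a for a, b in zip(pre, post)]
    # Mean difference divided by its standard error
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical scores for one class before and after a treatment
pre = [55, 60, 48, 62, 57, 50, 66, 53]
post = [61, 63, 55, 66, 60, 58, 70, 57]
t = paired_t(pre, post)
print(round(t, 2))  # compare against a t table with n-1 = 7 degrees of freedom
```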

Studies applying a practical action research approach are most commonly experimental, since an instructor is seeking to apply some type of pedagogical strategy, be it a type of task or material, in order to find a solution to an issue affecting learning in a specific class or sections within a course. Where student numbers are very low–six or fewer–a case study design is more appropriate, since statistical analyses involving very small groups of learners are less stable: an extremely high or low score can significantly affect the mean when the number of scores in a dataset is very low.
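The instability mentioned above is easy to demonstrate: with only a handful of scores, a single outlier pulls the mean far from the rest of the group. The scores below are invented for illustration.

```python
from statistics import mean

# Hypothetical test scores for a very small group
scores = [72, 75, 78, 74, 76]
print(mean(scores))  # → 75

# Replacing one score with an outlier shifts the mean dramatically when n is small
scores_with_outlier = [72, 75, 78, 74, 20]
print(mean(scores_with_outlier))  # → 63.8
```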

Criterion #3: displaying knowledge of relevant theory, research, and research gap

Much of a researcher’s authority to conduct a study on any given topic stems from their knowledge of theory and previous studies relating to that topic, based on which the researcher or team identifies a research gap which provides justification for their study. For instance, for a study on the usefulness of allowing learners to provide error feedback on writing assignments in a common first language (e.g., Cantonese for an EAP Writing class in Hong Kong), a researcher would need to provide evidence of strong background knowledge of theory relating to the effectiveness of error correction as well as the development of writing skills in a second language, in addition to awareness of previous research studies relating to error correction in writing, particularly that provided in a learner’s first language, whether it be written or oral and from peers and/or an instructor. The discussion of relevant theory and research, and the research gap which provides a niche for the study in question, would be provided in the literature review or conceptual framework of an article or conference session reporting on a completed study. The stronger the evidence for the researcher’s knowledge of theory and research related to their research topic, the more authority they have to present their own research, and the more credibility they stand to have in the eyes of their colleagues and other professionals in the field.

Criterion #4: stating appropriate research questions/statements and hypotheses

Based on the literature review and identification of the research gap, the researcher is ready to state the general purpose governing their study (common in naturalistic research) or the specific research question(s) the study seeks to answer. The open-ended nature of an ethnography, wherein the researcher seeks to discover and then describe the characteristics of the educational context being studied, generally does not lend itself to asking specific questions at the outset. Experimental studies, on the other hand, as well as practical action research studies, are much more focused in purpose, and the articulation of specific research questions allows both the researcher(s) and consumers of studies to know which specific answer(s) are being sought through a particular study. Moreover, in the case of experimental studies, research questions provide a basis for stating the statistical hypothesis/es being tested through the research. This includes action research studies employing an experimental design. Experiments are carried out to test one or more specific statistical hypotheses, and while articles on many experimental studies do not include specifically stated hypotheses, stating the hypothesis/es being tested supports the credibility of the study by providing evidence to the readers/listeners concerned that the researcher or team recognizes the hypothesis/es in play.

Criterion #5: Designing or adapting appropriate and well-constructed data-collection tools

Empirical research involves collecting data and drawing conclusions based on analysis of the collected data. It is essential, therefore, that the tools researchers use to collect data are well designed, allowing researchers to draw valid and reliable conclusions. Two of the most common data-collection tools used by teacher researchers are tests and questionnaires. I will therefore focus here on credibility issues related to their valid construction and reliable scoring/analysis. These issues also apply to tasks researchers administer during the course of a study.

According to Brown (2018), “a valid test measures exactly what it proposes to measure” (p. 32). This is what is referred to as construct validity, that is, actually measuring the skill in question. Thus, a test or task designed to test students’ ability in business writing, for example, might test that specific skill by requiring them to compose a letter of complaint regarding a delayed shipment of goods, or a request to reschedule a planned meeting. Such a test or task would also be required to display content validity if it is meant to be based on what has been covered in class, such as in a course textbook. Validity also depends on accurate scoring: answer keys need to be checked for accurate scoring of a multiple-choice grammar or reading test, for example, and any rubric used to evaluate a writing or speaking test/task needs to be rater friendly to ensure accurate scoring and needs to provide an accurate measurement of learners’ ability.

As for reliability, Brown mentions the need for rubrics that allow for consistent scoring, and tasks/items that are unambiguous for test-takers (p. 29); the latter applies to instructions as well as the layout of the task/items–a familiar task/test is far easier for students to understand, with respect to how to answer, than an unfamiliar one. In addition, inter-rater and intra-rater reliability are important issues for grading writing and speaking tests: a reliable rubric will facilitate a high rate of agreement between two raters grading the same test, as well as consistently accurate scoring for all students by a single rater.
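Inter-rater agreement can be quantified quite simply. The sketch below computes the proportion of essays two raters scored identically, along with Cohen's kappa, which corrects that figure for chance agreement; the band scores are hypothetical.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Proportion of essays given the same band by two raters."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(r1)
    po = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Expected chance agreement from each rater's marginal band frequencies
    pe = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (po - pe) / (1 - pe)

# Hypothetical band scores (1-5) from two raters on ten essays
rater1 = [3, 4, 2, 5, 3, 4, 3, 2, 4, 5]
rater2 = [3, 4, 2, 4, 3, 4, 3, 3, 4, 5]
print(percent_agreement(rater1, rater2))        # → 0.8
print(round(cohens_kappa(rater1, rater2), 2))   # → 0.72
```

Kappa is the more conservative figure, since some of the raw agreement would occur even if the raters scored at random.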

One data-collection technique used by TESOL instructors is to use tasks and rubrics from standardized tests. Thus an iBT TOEFL or IELTS Academic writing task and rubric might be used in an EAP writing class, or a TOEIC or IELTS speaking task and rubric in a Business English course focusing on speaking. While such tests are internationally recognized for both their validity and reliability, the specific task chosen needs to fulfill the requirement for content validity for the course in question as well, and the rubric needs to be both suitable for the proficiency level of the specific students and user friendly for the teacher researcher doing the scoring. Standardized-test rubrics are very task specific: the IELTS Academic test has two writing tasks with accompanying rubrics, and the iBT TOEFL has both an integrated and an independent writing task, each likewise with its own rubric. To ensure the credibility of adopted data-collection tools, teacher researchers need to understand the specific skills being tested by a particular task on a standardized test so that they select the most suitable task for their research purposes, and they need to be able to use the corresponding scoring rubric easily and consistently to provide reliable and valid grading.

As with tests and tasks measuring language skills, questionnaires used for data-collection purposes need to display both construct and content validity and must consistently measure the target construct (e.g., motivation or self-confidence). Thus a questionnaire on motivation must be shown to consistently measure that single construct, a criterion that can be assessed using a statistic such as Cronbach’s alpha. To strengthen questionnaire credibility, teacher researchers often adopt previously designed questionnaires recognized as valid and reliable measures, but they need to ensure that students can complete them accurately and that they (the researchers) can interpret the results accurately as well.
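Cronbach's alpha can be computed directly from item-level responses using the standard formula: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below illustrates this with invented responses from six students on four motivation items.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per item)."""
    k = len(items)
    item_vars = sum(pvariance(col) for col in items)
    # Each student's total score across all items
    totals = [sum(resp) for resp in zip(*items)]
    return (k / (k - 1)) * (1 - item_vars / pvariance(totals))

# Hypothetical 5-point responses: four motivation items from six students
items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [5, 5, 2, 4, 3, 5],
    [4, 5, 3, 4, 2, 4],
]
print(round(cronbach_alpha(items), 2))  # → 0.93
```

A value in this range would suggest the items measure a single construct consistently; a low alpha would signal that one or more items are pulling in a different direction.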

One technique to ensure comprehension of a questionnaire created in a learner’s second or third language is to have it translated, in which case checking the translated version for accuracy is required. Questionnaire credibility can also be enhanced by adding 1-2 open-ended items (to a Likert-style questionnaire, for example), which allows learners to respond in their own words with respect to the topic of the questionnaire (e.g., motivation). This does require, though, accurate interpretation of learners’ perspectives, which can be challenging if their L1 is not familiar to the researcher. A concern with respect to Likert-style questionnaires using a five-point or seven-point scale–in which the middle choice is “Neutral” or “Not sure”–is that overuse of the middle response by students wishing to complete the questionnaire quickly detracts from the credibility of the tool, since a response of “Neutral” or “Not sure” provides no additional information about a student’s perspective, weakening the validity of the instrument.
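One practical screening step for the midpoint-overuse concern raised above is to flag respondents who choose the middle option on more than some share of items, so that their questionnaires can be reviewed before analysis. The threshold, student IDs, and responses below are assumptions for illustration.

```python
# Hypothetical 5-point Likert responses (3 = "Neutral"), one list per student
responses = {
    "S01": [4, 5, 4, 3, 5, 4, 4, 5],
    "S02": [3, 3, 3, 3, 4, 3, 3, 3],
    "S03": [2, 3, 4, 2, 3, 4, 5, 2],
}

MIDPOINT, THRESHOLD = 3, 0.5  # flag students who pick "Neutral" over half the time

flagged = [sid for sid, answers in responses.items()
           if answers.count(MIDPOINT) / len(answers) > THRESHOLD]
print(flagged)  # → ['S02']
```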

Criterion #6: Carrying out suitable and sound research procedures

Research procedures vary considerably by approach; understanding the correct procedures for the approach being applied is critical for researchers.

For naturalistic studies, thorough data collection is an absolute necessity for making any claim for credibility with respect to conclusions reached. A sufficiently broad and in-depth process of data collection provides a thick description of the educational context that creates this credibility. Triangulation is the research strategy employed to achieve a thick description. Creswell describes triangulation as “the process of corroborating evidence from different individuals . . ., types of data . . ., or methods of data collection”. Achieving a thick description requires a long-term commitment to collect a variety of data by various means and from different informants. Creswell (2018) identifies a five-step procedure for carrying out data collection in such studies:

  1. identify the specific informants (individuals or groups who are regular participants in the educational context being researched, e.g., teachers and/or administrators in a school, students in a particular class) from whom you will collect data during your study
  2. obtain permission to carry out research by means of interviews with informants, class observations, etc.
  3. consider what types of information you need to gather
  4. prepare data-collection materials and recording devices
  5. collect different types of data from your selected informants

Different informants require different means of collecting information based on their preferred communication style and willingness to share their opinions. As this writer discovered doing his dissertation research on a bilingual school using a naturalistic approach, some teachers can be very approachable for interviews and share opinions readily. Others are less communicative with respect to in-person interaction but are willing to complete questionnaires. The same holds for school administrators. Larger groups such as students and parents may be most accessible by means of questionnaires, designed bilingually if need be; this is how I sought the opinions of both groups during my dissertation research. It is best to be flexible, patient, and courteous, particularly as an outsider at a research site, and to be well aware of, and careful to follow, ethical guidelines. Developing and maintaining positive relationships with informants is particularly important given the recursive nature of naturalistic studies: data collection and analysis in the early stages are general but become gradually more focused as patterns emerge during data analysis that enable a researcher to make, and investigate, tentative conclusions. These, in turn, lead to more structured interviews and questionnaires seeking more specific information as the researcher’s understanding of the site or learner(s)–as in a case study–gains greater clarity, until conclusions are confirmed and the portrait is completed.

Experiments involve a very different research process from that used in naturalistic studies. As he does with naturalistic studies, Creswell (2018) identifies a five-step process. The steps are generally the same for the two approaches; however, the manner in which participants are selected and data collection takes place in experimental studies is markedly different from that for ethnographies and case studies:

  • Participants are selected using probability sampling: the researcher(s) select, at random, a sample large enough to be considered representative of a larger target population, such as international students living in a particular metropolitan area.
  • While permission to carry out a naturalistic study at a school, for example, will be sought from administrators, permission to carry out an experimental study–unless the researcher is conducting the research as a faculty member of a particular institution–may simply involve gaining participants’ consent, once they agree to take part, to use their data for analytic purposes.
  • As with naturalistic studies, consideration of the type(s) of data to gather and preparation of data-collection materials and recording devices take place before the actual research begins. However, an experimental study involves grouping participants in line with the specific design selected for the study. A pure experiment will involve random assignment of each participant to either the control group or the experimental group–or to one of two or three experimental groups, depending on the design of the study.

While a researcher or team conducting a naturalistic study depends on a thorough and lengthy data-collection process to establish credibility, experimental researchers depend on a very carefully conducted treatment process to achieve that same purpose. Participants in the control group for a study will engage in some type of activity, but the activity chosen must be selected and conducted carefully so as not to involve participants in that group in any language practice that could affect the results of the study. On the other hand, the participants in each experimental group will participate in one or more activities designed to test the hypothesis articulated for that study. For example, for an experiment testing a hypothesis that L1 peer feedback on students’ paragraph drafts has no impact on the performance of Mandarin native speakers in an EAP writing class, participants in the control group would write the first draft of a paragraph on the topic of “My life hero,” then watch a video on famous heroes in history before writing their second draft. Participants in Experimental Group #1 would receive feedback on their first draft from their instructor in English, using a pre-designed error feedback key, while watching the same video as those in the control group before writing their second draft. Finally, participants in Experimental Group #2 would provide feedback on one another’s first drafts in Mandarin after receiving directions from their instructor on how to apply the same error feedback key used for Experimental Group #1.

For multivariate studies, credibility depends both on the selection of participants for a given study and on the reliability and validity of any data-collection instruments employed during the study. With respect to surveys, for instance, the researcher(s) must first identify their target group of participants in line with the research question(s) articulated for the study. For a survey on the impact of COVID-related campus access policies on teaching and learning of ESL/EFL during 2020 and the first half of 2021, this writer chose as the target participant group current and former students in the graduate TESOL program in which he is an instructor. This was a group whose contact information I could easily and ethically access, and a large portion of its members knew me either as a former instructor of some of their courses or as a colleague of those who had taught them. After gaining permission to conduct the study from my institution’s research ethics board, I set up the survey using an online server (Survey Monkey) and distributed it electronically with a preceding description of the study, which included an invitation to participate in an optional interview afterwards, for which those who took part received a promised Amazon gift certificate. The description also included a notice of consent indicating that participants’ decision to complete the online survey would be taken as their consent to participate in the study and to allow me, as the researcher, to use their responses for research purposes. Finally, the survey was designed so that each participant could complete it anonymously; it contained no information that could be used to specifically identify any participant, although those participants who agreed to take part in the optional interview were, with their foreknowledge, identifiable to me, though no report mentioned any of their names. Each interview was recorded with the participant’s knowledge.
Overall, the response rate for the survey was about thirty percent, a rate considered about average in our profession.

Action research studies employ a research procedure distinct from that used in other approaches. Because teachers conducting action research are working with their own students as participants, special ethical procedures must be followed to protect participants from the “power over” relationship that exists because it is the teacher who determines final grades. Teachers planning to conduct action research can choose from two procedural options:

The first option, which is the less common of the two, involves carrying out an action research study with a single student or a group of students within a class one is teaching. This is most likely due to diagnosis by the teacher of a learning issue that she/he would like to address in the short term but does not have class time available to deal with. For example, an elementary teacher may identify some students whose word-reading skills are particularly weak in comparison with other students in the class and wish to do some additional phonics work with those students. In such a case, the teacher concerned will seek permission to carry out the activity, in the form of an action research study, after school or at some other time that can be arranged. Permission would be needed from school or program administration to conduct the study, and consent would be needed from the learners’ parents or guardians for participation. Program/institutional ethical guidelines must be strictly adhered to, including proper procedures for working with participants who are minors. In addition, it is important to protect the confidentiality of data collected and the anonymity of participants.

The second option, in which formal research takes place after a course has been completed, is more common. In this case, during the teaching phase of the study, the teacher researcher conducts a “curricular innovation”–that is, makes an adjustment to normal pedagogical procedures in order to implement a technique, activity, or material that he/she believes could help resolve a learning issue impacting students. For example, a teacher who finds that students take a long time to read passages because they constantly consult their dictionaries may implement timed readings with no dictionary use permitted in order to help wean students off their over-dependence on dictionaries and build their confidence as readers. In such a case, while data collection, involving task worksheets, a pre/post-test, and/or a questionnaire, would take place during class, formal analysis of the collected data would not take place until after the course is completed and final grades submitted. This delay negates pressure on learners (or their parents/guardians) relating to the power relationship mentioned earlier, since the researcher is also the course teacher. Once grades are submitted, the teacher researcher would invite students, or the adults responsible for them, to participate in the research phase by seeking their consent to use their (or their child’s) data for formal research analysis. As with the first option, data confidentiality and participants’ anonymity are prime concerns.

Credibility in action research procedures depends, first, on the researcher(s) showing evidence of having done sufficient background research to gain a clear understanding of the issue being studied. This includes an awareness of the challenges faced by the learners in the particular educational context as well as knowledge of theory and research related to the general learning issue involved (for example, willingness to communicate). Second, it depends on providing one or more formally stated research questions and, where relevant, statistical hypotheses. Third, and most importantly, it depends on the selection of the optimal research design for the context and research focus concerned. For example, in cases where the teacher wishes to apply a curricular innovation, a one-group experimental design (or a quasi-experimental design, if two classes are involved) is the best choice, unless the size of the overall class is very small. An experimental design allows for a pre/post-test procedure to measure the effectiveness of the curricular innovation the teacher wishes to try out. A questionnaire can also be used to assess effectiveness by gleaning the students’ opinions of the innovation. Task worksheets provide additional information, as can a teacher diary, in which the teacher records his/her own observations and opinions during the teaching phase of the study. In a case where a teacher is working with a small class (six students or fewer) or a subgroup within a class, a case study design would likely be the best choice. A small group makes an experimental design a risky choice due to the instability of data-analytic procedures with such a small amount of data.
Moreover, a small group of participants allows for an in-depth analysis of treatment effectiveness by making use of student journals, open-ended questionnaires (in which participants provide written responses to questions), and, where a sufficient level of trust exists, face-to-face interviews or a focus-group discussion. Because the latter two options pose a significant threat to both the confidentiality and the anonymity of data, participants must be made aware beforehand of the risks involved.

Credibility in action research procedures is also affected by the manner in which the intervention, or treatment, is carried out. The length of the study, the clarity of procedures for the learners, and the steps taken to ensure accurate scoring of tests, accurate analysis of questionnaires, and proper storage and protection of collected data all impact credibility.

Criterion #7: Conducting appropriate data analyses

Procedures for analyzing collected data vary by research approach. There are two reasons for this. First, naturalistic and multivariate studies both use an inductive method to draw conclusions, while experimental studies use a deductive method to test one or more hypotheses. Action research may be deductive or inductive, depending on whether an experimental or case-study design is chosen. Second, numeric and other data require different methods of analysis to enable the researcher(s) concerned to interpret the results. This is particularly important in a study where the researcher collects both numeric and other data in order to strengthen the credibility of the findings. A naturalistic researcher, for example, may use numeric data such as test scores, questionnaire responses, and biodata such as age or years of study to gain a more in-depth understanding of the context being studied. Conversely, a researcher conducting an experimental study may make use of interviews to have more information available for rejecting or supporting a null hypothesis concerning the effectiveness of a particular technique.

The steps for analyzing both numeric and other data are as follows (based on Creswell, 2018):

For numeric data, the main steps are as follows:

  1. Score the data, that is, calculate scores (test or task) or tabulate results (questionnaire or survey)
  2. Select a data analysis application
  3. Select a data analytic procedure
  4. Input the data
  5. Run the analysis
  6. Interpret the result
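For a one-group pre/post-test design like the timed-reading example discussed earlier, the six steps might look like this in practice. This is a minimal sketch: the scores are invented, scipy is just one possible choice of analysis application, and a paired-samples t-test is one possible analytic procedure.

```python
# Steps 1-6 sketched for a one-group pre/post-test design.
# All scores below are invented for illustration.
from scipy import stats

pre = [52, 48, 61, 55, 47, 58, 50, 63, 45, 57]   # step 1: scored pre-test
post = [58, 55, 64, 60, 51, 65, 54, 70, 50, 62]  # step 1: scored post-test

# Steps 2-3: scipy is the application, a paired t-test the procedure.
# Steps 4-5: input the data and run the analysis.
t, p = stats.ttest_rel(post, pre)

# Step 6: interpret -- a small p-value suggests the gain in scores
# is unlikely to be due to chance alone.
print(f"t = {t:.2f}, p = {p:.4f}")
```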

Although many students in education regard statistics with apprehension, much statistical analysis in educational research is not complex, because the conditions present in large-scale studies–a large population of potential subjects from which to randomly select enough participants to support claims of external validity–do not exist in our classrooms. Complex statistical analyses generally require large numbers of participants to yield meaningful results; a class of 10-20 learners, or even 25-30, does not meet these conditions. Indeed, much can be learned from what are known as descriptive statistics–such as the number of answers for each choice in a Likert-style questionnaire or the mean score for a test. Teachers already learned these statistics at some point during their schooling, and they are very handy for the research teachers can most practically carry out: distributing anonymous questionnaires to their students on a class topic or on the impact a class has had on their motivation or self-confidence, or administering a task or test that can be used–within ethical parameters–for research purposes. Other descriptive statistics common in education include the percentile, range, median, and standard deviation, all of which are useful for understanding test results.
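The descriptive statistics named above can all be computed with Python's standard library alone. The test scores and Likert responses below are invented for illustration.

```python
# Descriptive statistics for a small class, standard library only.
import statistics
from collections import Counter

scores = [62, 75, 58, 81, 70, 66, 74, 59, 88, 72]  # invented test scores
likert = [4, 5, 3, 4, 5, 4, 2, 5, 4, 3]            # 1 = strongly disagree ... 5 = strongly agree

print("mean:", statistics.mean(scores))             # central tendency
print("median:", statistics.median(scores))         # middle score
print("range:", max(scores) - min(scores))          # spread, crudely
print("std dev:", round(statistics.stdev(scores), 2))  # spread, per score
print("Likert tally:", Counter(likert))             # answers per choice
```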

The other branch of statistics used in educational research is inferential statistics. These are applied, as the term suggests, to allow researchers to draw conclusions based on the results of statistical analyses. The ANOVA family of analyses, for example, enables researchers to understand the effects of independent variables (e.g., first language, native country, level of education) on dependent variables (grammatical accuracy or reading fluency, for example). Correlation analyses provide information about statistical relationships among variables, such as between grammatical knowledge and overall score on the IELTS test. Reliability analyses, such as Cronbach’s alpha, KR-20, and KR-21, provide understanding of the consistency with which a test or questionnaire measures only a single ability (e.g., vocabulary knowledge) or personal characteristic (motivation, confidence, or willingness to communicate). In experimental research, inferential analyses provide information supporting or rejecting a statistical hypothesis being tested in a particular study.

For verbal data, the analytic process is markedly different. The most important difference between numeric and verbal analyses is that the former is a linear process generally occurring once (during a survey study, for example), while the latter is generally an iterative process that becomes gradually more focused over the course of a longitudinal study such as an ethnography or case study. Creswell (2018) identifies the following steps:

  1. Transcribe and organize the data
  2. Explore the transcribed data
  3. Code the data
  4. Describe the coded data
  5. Identify and connect themes
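A toy sketch of the bookkeeping behind steps 3-5 follows. The codes and theme groupings are invented; real qualitative coding is iterative and interpretive, and software only tallies what the researcher has already judged.

```python
# Tallying coded transcript segments and grouping codes into themes.
# Codes and themes are invented for illustration.
from collections import Counter

coded_segments = [
    "anxiety", "dictionary_use", "anxiety", "peer_support",
    "dictionary_use", "anxiety", "time_pressure", "peer_support",
]
code_counts = Counter(coded_segments)  # step 3-4: code and describe

# Step 5: a hypothetical grouping of codes under broader themes.
themes = {
    "affective factors": ["anxiety", "peer_support"],
    "reading strategies": ["dictionary_use", "time_pressure"],
}
for theme, codes in themes.items():
    print(theme, "->", sum(code_counts[c] for c in codes), "segments")
```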

Analysis during naturalistic studies reflects the aim of a researcher or team to provide a thick description of the research context as mentioned earlier, a process that requires considerable time, and numerous iterations, to be sufficiently thorough. With multiple informants involved, as well as multiple types of data, the process of transcribing, exploring, and describing data is one that needs to be repeated often before themes are developed that allow the researcher(s) to reach a credible conclusion. Where numeric data is also collected over the course of a naturalistic study, the findings from statistical analyses need to be accounted for as themes are identified. The more thorough the triangulation process, the “thicker” the description that results, and the more credible the conclusions made.

Criterion #8: Making appropriate interpretations and conclusions

The elements of research credibility discussed up to now serve as building blocks for making appropriate interpretations and conclusions for any research study. Selecting the right research approach at the outset is critical to choosing and applying the most appropriate design. Identifying the research questions you seek to answer, and stating appropriate statistical hypotheses when needed, will serve as helpful guidelines for selecting participants, creating or adopting data collection instruments, applying the correct data collection procedures, and conducting appropriate data analyses. Credibility is also dependent on a display of sufficient knowledge of theory and previous research relevant to your research topic; such knowledge lends authority to the conclusion(s) reached by the researcher(s), as does knowledge and application of the ethical constraints governing one’s study. Additionally, credibility is dependent on logical interpretations made based on the results of data analyses; answers to research questions must make sense based on the findings, and support or rejection of any statistical hypotheses governing a study must likewise follow logically from the results of the statistical analysis/es carried out.

Finally, credibility is strengthened by a clear statement of the limitations pertaining to a particular study–and weakened by its absence; no research study is perfect. A naturalistic researcher is dependent on the quality of information provided by participants, and ethical constraints can prevent access to data that could add valuable insights. A good experiment has a specific focus and features careful selection of participants; however, such requirements tend to limit the external validity of the conclusions reached. Participation in survey research is generally well under 50% of those initially recruited, and it is strongly suspected that those who do participate in surveys differ in important ways from those who do not. Lastly, action research studies, designed to find solutions to learning challenges affecting a particular classroom, do not lend themselves to extending conclusions beyond the group of students being researched.

In conclusion . . .

Conducting high-quality educational research does not require one to be a statistical genius. Knowledge of the criteria for credible research in education is far more important than mathematical expertise. Taking a course in research methods can be a great asset to any teacher’s professional development, both as a consumer and as a producer of research. As a teacher takes up research and becomes adept at it over time, the following will apply (Moulden, 2021):

Knowledge feeds practice.

Knowledge-based practice builds excellence.

Excellence breeds fruitfulness.


Billot, J., Rowland, S., Carnell, B., Amundsen, C., & Evans, T. (2017). How experienced SoTL researchers develop the credibility of their work. Teaching & Learning Inquiry, 5(1).

Brown, H. D. (2018). Language assessment: Principles and classroom practices (3rd ed.). Pearson Education.

Creswell, J. W. (2018). Educational research: Planning, conducting, and evaluating quantitative and qualitative research (6th ed.). Pearson Education.

Kanuka, H. (2011). Keeping the scholarship in the scholarship of teaching and learning. International Journal for the Scholarship of Teaching and Learning, 5(1), 1-12.

Moulden, G. A. (2021, December 30). Research: the classroom teacher’s Swiss army knife. https://wordpress.com/post/tesol.solutions/327
