Saturday, 16 May 2020

MEETING 15: ASSESSING GRAMMAR AND ASSESSING VOCABULARY


CHAPTER ONE: Differing notions of ‘grammar’ for assessment
A. Language views (General approaches) 
1. Syntactocentric perspective
It is predominantly concerned with the structure of clauses and sentences. This view defines “grammar” as a systematic way of accounting for and predicting an “ideal” speaker’s or hearer’s knowledge of the language.
2. Communicative perspective
It focuses more on the overall message being communicated and the interpretations that this message might invoke. “Grammar” is treated as one of the many resources for accomplishing something with language.

B. Syntactocentric perspectives of the language
1. Traditional grammar
It is based on a set of prescriptive rules along with the exceptions. It is criticized for its inability to provide descriptions of the language that could adequately incorporate the exceptions into the framework and for its lack of generalizability to other languages.
2. Structural grammar
It describes the structure of the language in terms of both its morphology and its syntax. Each word in a given sentence is categorized according to its use, and the “patterns” or “structures” are said to constitute a unique system for that language.
3. Transformational-generative grammar
It provides a “universal” description of language behaviour, revealing the internal linguistic system for which all humans are predisposed. The underlying properties of any language system can be uncovered by means of a detailed sentence-level analysis. This Universal Grammar (UG) has been criticized for failing to account for meaning or language use in social context.
C. Corpus linguistics
1. It is the practice of compiling linguistic corpora, that is, large and principled collections of natural, authentic spoken and written texts. A corpus shows how often and where a linguistic form occurs in spoken or written text.
2. It provides information on patterns of variation in language use, language change, and varieties of language. It also provides information on the different semantic functions of lexical items, as well as distributional and frequency information on the lexico-grammatical features of the language.
3. It challenges language teachers to rethink how they view the content of a language curriculum and the manner in which this curriculum is presented to students.
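The "how often and where" information that a corpus provides can be illustrated with a minimal Python sketch. The three-text mini-corpus, its text labels and the function name form_frequencies are all hypothetical and exist only for illustration; a real corpus would be far larger and compiled on principled grounds.

```python
import re

# Hypothetical mini-corpus: text labels and sentences are invented for illustration.
CORPUS = {
    "spoken_1": "When I was a kid we would walk to school and we would never complain",
    "written_1": "The committee would prefer a decision that the city can afford",
    "written_2": "Thieves broke into the bank and stole millions",
}

def form_frequencies(corpus, form):
    """Count whole-word occurrences of `form` in each text of the corpus."""
    pattern = re.compile(r"\b" + re.escape(form) + r"\b", re.IGNORECASE)
    return {name: len(pattern.findall(text)) for name, text in corpus.items()}

# Shows in which texts the form occurs and how often.
print(form_frequencies(CORPUS, "would"))
```

The per-text counts show where the form occurs, and their sum shows how often it occurs overall.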
Katz and Fodor (1963) found that in addition to encoding semantic features and restrictions, a word also contains a number of syntactic features, including the part of speech (noun, verb, adjective), countability (singular, plural) and gender (masculine, feminine). A word can also mark prepositional co-occurrence restrictions, such as when the word ‘think’ is followed by a preposition (about, of, over) or by a that-clause. Katz and Fodor called this ‘the grammatical dimension of lexis’.

D. Theories of Communication
1. Systemic-functional grammar
Context and meaning take precedence over linguistic form. It typically describes features of grammatical form that are used to express meaning beyond a single, context-free utterance. Grammatical form is seen as having a symbiotic relationship with meaning and pragmatic use, where each influences and shapes the other within and across utterances.
2. Speech act theory
Effective communication is not simply perceived as a function of linguistic accuracy or acceptable grammar to convey literal and intended meaning. Communication must be appropriate for the context, i.e. speakers must have both ‘linguistic competence’ and ‘communicative competence’.
Both theories have had a considerable impact on L2 syllabus design, teaching and testing, and are credited with shifting the emphasis of language classrooms from a formal grammatical focus to a communication-based one.

E. Pedagogical grammar 
1. It represents an eclectic, but principled, description of the target-language forms, created for the express purpose of helping teachers understand the linguistic resources of communication.
2. These grammars provide information about how language is organized and offer relatively accessible ways of describing complex, linguistic phenomena for pedagogical purposes.
3. The more L2 teachers understand how the grammatical system works, the better they will be able to tailor this information to their specific instructional contexts. 
4. Besides formal pedagogical grammars (and, of course, SLA theory), language teachers would be advised to consult language textbooks when put to the task of specifying grammatical content for instruction or assessment.
5. These books not only provide descriptions, albeit less comprehensive, of the target grammar, but they also inform teachers of the scope with which a grammar point might be treated at a particular proficiency level or the sequence with which grammar points might be introduced.
CHAPTER TWO: Research on L2 grammar teaching and learning
Research on L2 teaching and learning
The SLA research looking at the role of grammar instruction in SLA might be categorized into three strands. One set of studies has looked at the relationship between the acquisition of L2 grammatical knowledge and different language-teaching methods. These are referred to as the comparative methods studies. A second set of studies has examined the acquisition of L2 grammatical knowledge through what Long and Robinson (1998) call a ‘non-interventionist’ approach to instruction. These studies have examined the degree to which grammatical ability could be acquired incidentally (while doing something else) or implicitly (without awareness), and not through explicit (with awareness) grammar instruction. A third set of studies has investigated the relationship between explicit grammar instruction and the acquisition of L2 grammatical ability. These are referred to as the interventionist studies, and are a topic of particular interest to language teachers and testers.
Comparative methods studies
These studies were in reaction to form-focused instruction (referred to as ‘focus on forms’ by Long, 1991), which used a traditional structural syllabus of grammatical forms as the organizing principle for L2 instruction. According to Ellis (1997), form-focused instruction contrasts with meaning-focused instruction in that meaning-focused instruction emphasizes the communication of messages (i.e., the act of making a suggestion and the content of such a suggestion) while form-focused instruction stresses the learning of linguistic forms. These can be further contrasted with form-and-meaning focused instruction (referred to by Long (1991) as ‘focus-on-form’), where grammar instruction occurs in a meaning-based environment and where learners strive to communicate meaning while paying attention to form.
Non-Interventionist studies
While some language educators were examining different methods of teaching grammar in the 1960s, others were feeling a growing sense of dissatisfaction with the central role of grammar in the L2 curriculum. As a result, questions regarding the centrality of grammar were again raised by a small group of L2 teachers and syllabus designers who felt that the teaching of grammar in any form simply did not produce the desired classroom results. Newmark (1966), in fact, asserted that grammatical analysis and the systematic practice of grammatical forms were actually interfering with the process of L2 learning, rather than promoting it, and if left uninterrupted, second language acquisition, similar to first language acquisition, would proceed naturally.
Rather than making incremental leaps in grammatical ability through an accumulation of grammatical forms, as presented in a traditional grammar syllabus, learners in both instructed and naturalistic settings acquired the target structures in a relatively fixed order (Ellis, 1994), regardless of when they were introduced.
Learners acquiring any individual grammatical feature such as negatives, interrogatives, relative clauses, word order, or pronouns appeared to pass through a relatively fixed developmental sequence toward mastering that form (Ellis, 1994).
The empirical evidence of ordered acquisitional patterns coupled with dissatisfaction with the results obtained from grammar teaching led a few SLA researchers to call for the total abandonment of traditional grammar instruction in the L2 classroom. Drastic as this was, researchers supporting this position (e.g., Krashen, 1982; Prabhu, 1987) argued that an L2 is not actually acquired through formal instruction; rather, it is learned incidentally and implicitly through exposure to the target language, as long as the input that learners are exposed to is made comprehensible. These researchers further claimed that in input-rich settings, the learner’s attention is focused solely on meaning in natural communication, and any form of explicit error correction is harmful to the acquisitional process. Supporters of this position further maintained that grammar acquisition was impervious to form-focused instruction, since the ‘natural’ processes of acquisition were at work. In other words, learners progress toward native-like proficiency in a predetermined order, making a number of predictable interlanguage errors, regardless of any instructional intervention. Finally, some researchers (e.g., Pica, 1983) found that learners who, in fact, did receive form-focused instruction showed an order of acquisition of grammatical features similar to that seen with the naturalistic learners, lending further support to the noninterventionist position.
Empirical studies in support of non-intervention
The non-interventionist position was examined empirically by Prabhu (1987) in a project known as the Communicational Teaching Project (CTP) in southern India. This study sought to demonstrate that the development of grammatical ability could be achieved through a task-based, rather than a form-focused, approach to language teaching, provided that the tasks required learners to engage in meaningful communication.
The non-interventionist position can be credited with showing us (1) that learners appear to acquire different grammatical structures in a fixed ‘acquisitional order’ and the same structure in a fixed ‘acquisitional sequence’, (2) that meaning-focused classrooms can promote the development of L2 fluency provided there are plenty of opportunities for meaningful communication and (3) that meaning-focused classrooms can promote the development of grammatical ability no less than traditional classrooms, although, as we will see, this may be inadequate for promoting high levels of SLA in a timely and efficient manner.
Possible implications of fixed developmental order to language assessment
Grammar tests targeting beginning English-language learners often include questions on the articles and the third-person singular -s affix, two features considered to be ‘very challenging’ from an acquisitional perspective.
The inclusion of these items in a placement test would be highly appropriate since the goal of placement assessment is to identify a wide range of ability levels so that developmentally homogeneous groups can be formed.
Problems with the use of developmental sequences as a basis for assessment
First, the number of grammatical sequences that show a fixed order of acquisition is very limited, far too limited for all but the most restricted types of grammar tests.
Second, much of the research on acquisitional sequences is based on data from naturalistic settings, where students are provided with considerable exposure to the language.
Third, as the rate (not the route) of acquisition appears to be influenced by the learner’s first language and by exposure to other languages, we need to understand how these factors might impact on development rates and how we would reconcile this if we wished to test heterogeneous groups of language learners. 
Finally, as the developmental levels represent an ordering of grammatical rules during acquisition, this may or may not be on the same measurement scale as accuracy scores.
Interventionist studies
Some researchers (e.g., Schmidt, 1983; Swain, 1991) have maintained that although some L2 learners are successful in acquiring selected linguistic features without explicit grammar instruction, the majority fail to do so.
Most language teachers would contend that explicit grammar instruction, including systematic error correction and other instructional techniques, contributes immensely to their students’ linguistic development.
Empirical studies in support of intervention
Aside from anecdotal evidence, the non-interventionist position has come under intense attack on both theoretical and empirical grounds, with several SLA researchers affirming that efforts to teach L2 grammar typically result in the development of L2 grammatical ability. Hulstijn (1989) and Alanen (1995) investigated the effectiveness of L2 grammar instruction on SLA in comparison with no formal instruction. They found that when coupled with meaning-focused instruction, the formal instruction of grammar appears to be more effective than exposure to meaning or form alone. Long (1991) also argued for a focus on both meaning and form in classrooms that are organized around meaningful and sustained communicative interaction. He maintained that the focus on grammar in communicative interaction serves as an aid to clarity and precision.
Research on instructional techniques and their effects on acquisition
Form- or rule-based techniques revolve around the instruction of grammatical forms. They can involve implicit, inductive grammar teaching, where the focus is on meaning, but the goal is to attract the learner’s attention to the form without using grammatical metatalk, or linguistic terminology.
Input-based techniques deal with how input is used in grammar instruction. One such technique is input flooding, where learners are presented with large amounts of input in which the targeted feature is present. Another involves typographical input enhancement, where input is manipulated by means of capitalization, printing in boldface and so forth. Comprehension practice is an input-based technique, where learners are asked to relate grammatical form to meaning – often by means of pictures or meaning-focused questions. 
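Typographical input enhancement of the kind just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name enhance_input and the sample sentence are hypothetical, and capitalization is used here simply because it is one of the manipulations mentioned above and works in plain text.

```python
import re

def enhance_input(text, target_form):
    """Capitalize every whole-word occurrence of target_form in text."""
    pattern = re.compile(r"\b" + re.escape(target_form) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda match: match.group(0).upper(), text)

# Hypothetical sample input in which the targeted feature is 'would'.
sample = "We would walk to school every day, and we would never complain."
print(enhance_input(sample, "would"))
```

The rest of the input is left untouched, so the learner still reads for meaning while the targeted form stands out visually.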
Feedback-based techniques involve ways of providing negative evidence of grammar performance. For example, ‘recast’ is a feedback-based technique, where an utterance containing an error is repeated without the error. Another is referred to as ‘garden path’ since learners are explicitly shown the linguistic rule and allowed to generalize with other examples; however, when the generalization does not hold (negative evidence), further instruction is provided. Finally, metalinguistic feedback involves the use of linguistic terminology to promote ‘noticing’.
A final set of instructional techniques mentioned by Norris and Ortega (2000) are practice-based techniques of grammar instruction. These involve input-processing instruction and output practice (Lee and VanPatten, 2003).
Grammar processing and second language development
In the grammar-learning process, explicit grammatical knowledge refers to a conscious knowledge of grammatical forms and their meanings. Explicit knowledge is usually accessed slowly, even when it is almost fully automatized (Ellis, 2001b). DeKeyser (1995) characterizes grammatical instruction as ‘explicit’ when it involves the explanation of a rule or the request to focus on a grammatical feature. Instruction can be explicitly deductive, where learners are given rules and asked to apply them, or explicitly inductive, where they are given samples of language from which to generate rules and make generalizations.
Implicit grammatical knowledge refers to ‘the knowledge of a language that is typically manifest in some form of naturally occurring language behavior such as conversation’ (Ellis, 2001b, p. 252). In terms of processing time, it is unconscious and is accessed quickly. DeKeyser (1995) classifies grammatical instruction as implicit when it does not involve rule presentation or a request to focus on form in the input; rather, implicit grammatical instruction involves semantic processing of the input with any degree of awareness of grammatical form.
Implications for assessing grammar
Assessments of explicit grammatical knowledge would show how well students could apply the forms in contexts where fluent and spontaneous language use is not required and where time could be taken to figure out the answers. Inferences from the results of these assessments could be useful for teachers wishing to determine if their students have mastered certain grammatical forms.
To obtain information on the students’ implicit knowledge of grammatical forms, testers would need to create tasks designed to elicit the fluent and spontaneous use of grammatical forms in situations where automatic language use was required.
CHAPTER THREE: The role of grammar in models of communicative language ability 
The role of grammar in models of communicative competence
Rea-Dickins’ definition of grammar
Rea-Dickins (1991) defined ‘grammar’ as the single embodiment of syntax, semantics and pragmatics. 
Rea-Dickins (1991) further stated that the goal of communicative grammar tests is to provide an ‘opportunity for the test-taker to create his or her own message and to produce grammatical responses as appropriate to a given context’ (p. 125). This underscores the notion that pragmatic appropriateness or acceptability can add a crucial dimension to communication, and must not be ignored.
Rea-Dickins’ emphasis on grammar as pragmatics correctly reminds us of the close relationship among grammar, semantics and pragmatics. She also reminds us that the distinctions between these levels are at times fuzzy at best.
Larsen-Freeman’s definition of grammar
Drawing on several linguistic theories and influenced by language teaching pedagogy, Larsen-Freeman has also characterized grammatical knowledge along three dimensions: linguistic form, semantic meaning and pragmatic use. Form is defined as both morphology, or how words are formed, and syntactic patterns, or how words are strung together. This dimension is primarily concerned with linguistic accuracy. The meaning dimension describes the inherent or literal message conveyed by a lexical item or a lexico-grammatical feature. This dimension is mainly concerned with the meaningfulness of an utterance. The use dimension refers to the lexico-grammatical choices a learner makes to communicate appropriately within a specific context. Pragmatic use describes when and why one linguistic feature is used in a given context instead of another, especially when the two choices convey a similar literal meaning. In this respect, pragmatic use is said to embody presuppositions about situational context, linguistic context, discourse context, and sociocultural context. This dimension is mainly concerned with making the right choice of forms in order to convey an appropriate message for the context.
According to Larsen-Freeman (1991), these three dimensions may be viewed as independent or interconnected. For example, a linguistic form such as the articles in English displays a syntactic, semantic and pragmatic dimension, even though, perhaps in the classroom, it might be necessary to focus more on the pragmatic aspect, which can pose the greatest challenge to learners.
What is meant by ‘grammar’ for assessment purposes?
 
CHAPTER FOUR: Towards a definition of grammatical ability
What is meant by grammatical ability?
Grammatical ability is, then, the combination of grammatical knowledge and strategic competence; it is specifically defined as the capacity to realize grammatical knowledge accurately and meaningfully in testing or other language-use situations.
What is ‘grammatical ability’ for assessment purposes?
First, grammar encompasses grammatical form and meaning, whereas pragmatics is a separate, but related, component of language. A second is that grammatical knowledge, along with strategic competence, constitutes grammatical ability. A third is that grammatical ability involves the capacity to realize grammatical knowledge accurately and meaningfully in test-taking or other language-use contexts. The capacity to access grammatical knowledge to understand and convey meaning is related to a person’s strategic competence. It is this interaction that enables examinees to implement their grammatical ability in language use. Next, in tests and other language-use contexts, grammatical ability may interact with pragmatic ability (i.e., pragmatic knowledge and strategic competence) on the one hand, and with a host of non-linguistic factors such as the test-taker’s topical knowledge, personal attributes, affective schemata and the characteristics of the task on the other. Finally, in cases where grammatical ability is assessed by means of an interactive test task involving two or more interlocutors, the way grammatical ability is realized will be significantly impacted by both the contextual and the interpretative demands of the interaction.
Knowledge of phonological or graphological form and meaning
 
Knowledge of lexical form and meaning
Knowledge of lexical form enables us to understand and produce those features of words that encode grammar rather than those that reveal meaning. This includes words that mark gender (e.g., waitress), countability (e.g., people) or part of speech (e.g., relate, relation).
Knowledge of lexical meaning allows us to interpret and use words based on their literal meanings. Lexical meaning here does not encompass the suggested or implied meanings of words based on contextual, sociocultural, psychological or rhetorical associations.
Knowledge of morphosyntactic form and meaning
Knowledge of morphosyntactic form permits us to understand and produce both the morphological and syntactic forms of the language. This includes the articles, prepositions, pronouns, affixes (e.g., -est), syntactic structures, word order, simple, compound and complex sentences, mood, voice and modality.
Morphosyntactic forms carry morphosyntactic meanings which allow us to interpret and express meanings from inflections such as aspect and time, meanings from derivations such as negation and agency, and meanings from syntax such as those used to express attitudes (e.g., subjunctive mood) or show focus, emphasis or contrast (e.g., voice and word order).
Knowledge of cohesive form and meaning
Knowledge of cohesive form enables us to use the phonological, lexical and morphosyntactic features of the language in order to interpret and express cohesion on both the sentence and the discourse levels. Cohesive form is directly related to cohesive meaning through cohesive devices (e.g., she, this, here) which create links between cohesive forms and their referential meanings within the linguistic environment or the surrounding co-text.
Knowledge of information management form and meaning
Knowledge of information management form allows us to use linguistic forms as a resource for interpreting and expressing the information structure of discourse. Some resources that help manage the presentation of information include, for example, prosody, word order, tense-aspect and parallel structures. These forms are used to create information management meaning.
Knowledge of interactional form and meaning
Knowledge of interactional form enables us to understand and use linguistic forms as a resource for understanding and managing talk-in-interaction. These forms include discourse markers and communication management strategies. Discourse markers include a set of adverbs. Conversation-management strategies include a wide range of linguistic forms that serve to facilitate smooth interaction or to repair interaction when communication breaks down.
Similar to cohesive forms and information management forms, interactional forms use phonological, lexical and morphosyntactic resources to encode interactional meaning.
CHAPTER FIVE: Designing test tasks to measure L2 grammatical ability
How does test development begin?
Every grammar-test development project begins with a desire to obtain (and often provide) information about how well a student knows grammar in order to convey meaning in some situation where the target language is used. The information obtained from this assessment then forms the basis for decision-making. Those situations in which we use the target language to communicate in real life or in which we use it for instruction or testing are referred to as the target language use (TLU) situations (Bachman and Palmer, 1996). Within these situations, the tasks or activities requiring language to achieve a communicative goal are called the target language use tasks. A TLU task is one of many language-use tasks that test-takers might encounter in the target language use domain. It is to this domain that language testers would like to make inferences about language ability, or more specifically, about grammatical ability.
The definition of grammatical knowledge provides a basis for identifying and making explicit the areas of grammatical knowledge to be measured, as seen in Figure 5.1.
 
Example 1: Multiple-choice task
Designed to test grammatical form (morphosyntax-word order)
Directions: Circle the correct answer.
A: Can’t Tom drive us to the airport?
B: He has ____ to take us all.
(a) such small a car
(b) very small a car
(c) a too small car
(d) too small a car ✔
Example 2: Multiple-choice task
Designed to test grammatical form and meaning (cohesive-ellipsis)
Directions: Circle the correct answer.
A: Will you and Ann go away this summer?
B: I imagine ____.
(a) it
(b) so ✔
(c) that
(d) we’ll
Example 3: Multiple-choice task
Designed to test grammatical form and meaning (multiple areas)
Directions: Circle the correct answer.
A: Wow! You got a new hairdo. I love it!
B: Thanks, but ____________________________________
A: No, you don’t. You look great!
(a) I liked it the other way.
(b) What happened to you?
(c) You look great.
(d) I look ridiculous! ✔
Example 4: Multiple-choice error identification task
Designed to measure grammatical form 
Directions: Circle the letter corresponding to the error.
As my car had broken down, I decided to go there by foot. 
A B C D ✔
Example 5: Matching task
Designed to measure grammatical meaning (denotation)
Directions: Match the letter of the underlined word(s) with its meaning. Write the letter on the line.
Last week while Tom and Jane were having dinner in a restaurant, thieves (a) broke down the front door of their pretty little house, went inside and (b) broke into their safe. Now they’re (c) broke.
___ 1. poor
___ 2. make into two or more pieces
___ 3. enter to steal something
___ 4. enter by force
Example 6: Discrimination task
Designed to measure morphosyntactic meaning 
Directions: Match the sentence with the picture by writing the number in the box on the line.

___ Se la entregó a ella. [He delivered it to her.]
Example 7: Noticing task 
Designed to measure grammatical meaning (morphosyntactic meaning) 
Directions: Circle ‘would’ when it refers to the habitual past. Underline it when it refers to the present or future.
You know? You think you’ve got it bad. When I was a kid, we would have to walk up hill to and from school every day. We would even do it when it snowed – winter and summer. And we would never even think of complaining. We would smile and go about our business. I wouldn’t change those days for anything. Would you now please ‘shut up’ and take out the garbage?
Example 8: Gap-filling task 
Designed to measure grammatical form and meaning 
Directions: Fill in the blank with an appropriate form of the verb.
In about 20 AD Apicus was well known for the cookbooks he (1) ________________ in his spare time. He was equally famous for the lavish meals he (2) ________________ for his family and guests.
(Adapted from Purpura and Pinkley, 2000)
Example 9: Short-answer task 
Designed to measure grammatical form and meaning 
Directions: Use the job ad to complete the application form.
 
Name: Job applied for: 
Qualifications for job applied for: 
Current job: Reason for leaving:
Example 10: Discourse completion task 
Designed to measure grammatical form and meaning on the discourse level 
Directions: Complete the conversation the two friends are having.
A: I can’t believe that disgusting little waiter told me ‘to get a life’ when I showed him the hair in my soup. 
B: Well, if I were you, ___________________________________________! 
A: Nah, I don’t want to be rude.
 
Example 12: Reporting task  
Designed to measure grammatical form and meaning on the sentence and discourse levels
Directions – Part A: Last night there was a break-in at the Santellis’. You are the detective on the case. For each piece of evidence below, make a written speculation about the burglary. Use modals whenever possible.
1. The kitchen lock was forced open and a window was broken. 
2. Traces of cookies and milk were found on the kitchen counter. 
3. There was a wet towel in the shower. 
4. All of Mrs. Santelli’s diamonds are missing.
Directions – Part B: Based on the evidence, draw some tentative conclusions about the thief. Write a brief progress report on the situation for a new colleague on the case. 
(Adapted from Purpura and Pinkley, 2000)
Example 13: Simulation task 
Designed to measure grammatical form and meaning on the discourse level 
Directions: Your local government has just received a large amount of money to solve one of its problems. You are on the committee to decide which one to solve. You will be given a problem to advocate for. Your job is to convince your group that the city should solve your problem first. You will have five minutes to plan your argument.
Once you are in your group, describe your problem to the others. When you hear all the problems, work together to decide which problem the city should solve first. Try to get your problem solved first.
(Each student is given only one role)
Person A The city is upset about pollution. There are more and more cars every year, and they are aggravating the pollution problem. The government does not want to make pollution laws because it is afraid factories will close. However, more and more people are having pollution-related health problems. The city needs money to help the factories install anti-pollution technology.
Person B The city is worried about crime. In some neighborhoods crime has increased dramatically within the last year, and people are afraid to walk in certain areas at night. More and more people are reporting street crimes. Recently thieves broke into a bank and stole millions. Violent crime is increasing too. The city needs money to hire more policemen and to install modern crime technology.
Person C The schools are in desperate need of help. Classrooms are overcrowded and buildings are falling apart from lack of maintenance. New teachers do not want to begin their careers in these conditions and veteran teachers are leaving the schools to accept jobs in the suburbs, where they are paid twice as much. The schools also need funds to support ESL instruction for growing numbers of immigrant students. Every child deserves to have the opportunity for a good education.
(Adapted from Purpura and Pinkley, 1999)
CHAPTER SIX: Developing tests to measure L2 grammatical ability
What makes a grammar test ‘useful’?
The quality of reliability
When we talk about ‘reliability’ in reference to a car, we all know what that means. A car is said to be reliable if it readily starts up every time we want to use it regardless of the weather, the time of day or the user. It is also considered reliable if the brakes never fail, and the steering is consistently responsive. These mechanical functions, working together, make the car’s performance anywhere from zero to one hundred percent reliable. Similarly, the scores from tests or components of tests can also be characterized as being reliable when the tests provide the same results every time we administer them, regardless of the conditions under which they are administered. In other words, test scores should not fluctuate drastically as a result of the time of the test administration, the form of the test used (provided there exists more than one form), or the raters who might have scored the responses. This consistency of measurement is referred to as test reliability, and it ranges on a scale from zero (no consistency) to one (perfect consistency).
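The idea of score consistency across administrations can be made concrete with a small Python sketch. It rests on an assumption that is not part of the text above: the Pearson correlation between two administrations of the same test is used here as one common estimate of consistency. The function name pearson_r and the score lists are hypothetical. Note that a correlation can in principle be negative, whereas the reliability scale described above runs from zero to one.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five test-takers on two administrations.
first_administration = [12, 15, 9, 18, 14]
second_administration = [13, 15, 10, 17, 14]
print(round(pearson_r(first_administration, second_administration), 2))
```

A value close to one suggests that the test ranks the same test-takers in much the same way on both occasions; a value near zero suggests that the scores fluctuate for reasons unrelated to ability.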
The quality of construct validity
The second quality that all ‘useful’ tests possess is construct validity. Bachman and Palmer (1996) define construct validity as ‘the extent to which we can interpret a given test score as an indicator of the ability(ies), or construct(s), we want to measure. Construct validity also has to do with the domain of generalization to which our score interpretations generalize’ (p. 21).
The quality of authenticity
Authenticity of content is the degree to which the topical, thematic or contextual characteristics of the test tasks match those of the TLU tasks.
The quality of interactiveness
This quality refers to the degree to which the aspects of the test-taker’s language ability we want to measure (e.g., grammatical knowledge, language knowledge) are engaged by the test-task characteristics (e.g., the input, the response, and the relationship between the input and response) based on the test constructs. In other words, the task should engage the characteristics we want to measure (e.g., grammatical knowledge) given the test purpose, and nothing else (e.g., topical knowledge, affective schemata); otherwise, this may mask the very constructs we are trying to measure.
The quality of impact
Impact refers to the link between the inferences we make from scores and the decisions we make based on these interpretations. In terms of impact, most educators would agree that tests should promote positive test-taker experiences leading to positive attitudes (e.g., a feeling of accomplishment) and actions (e.g., studying hard). A special case of test impact is washback, which is the degree to which testing has an influence on learning and instruction. Washback can be observed in grammar assessment through the actions and attitudes that test-takers display as a result of their perceptions of the test and its influence over them. 
The quality of practicality
Test practicality is not a quality of a test itself, but is a function of the extent to which we are able to balance the costs associated with designing, developing, administering, and scoring a test in light of the available resources (Bachman, personal communication, 2002).
Overview of grammar-test construction
Stage 1: Design
Purpose
The purpose statement could also include who is impacted by the decisions and whether the stakes are high or low. It could also specify how the results of the test are intended to be used. For example, in most classroom tests the results of assessment will be used to promote further learning or to inform instruction.
TLU domains and tasks
After describing the purpose, the TLU domain is identified (e.g., real-life and/or language-instructional) and the TLU task types are selected. To identify language-use tasks and the type of language needed to perform these tasks, a needs analysis must be performed. This involves the collection and analysis of information related to the students’ target-language needs. Depending on the testing situation, a needs analysis can be relatively informal or very complex.
Characteristics of test-takers
The design statement contains a detailed description of the characteristics of the test-takers, so that the population of test-takers for whom the test is intended and to whom the test scores might generalize can be made explicit. The personal attributes of test-takers which can potentially influence test results include age, native language, gender, level of language ability and so forth.
Construct(s) to be measured
The design statement also provides a theoretical definition of the construct(s) to be measured in the test. Construct definition can be based on a set of instructional objectives in a syllabus, a set of standards, a theoretical definition of the construct or some combination of them all.
Construct definition based on a syllabus (or a textbook) is useful when we want to determine to what degree students have mastered the grammar points that have been taught during a certain period.
Plan for evaluating usefulness
The test design statement also provides a description of a plan for assessing the qualities of test usefulness. From the beginning of grammar-test development, it is important to consider all six qualities of test usefulness and to determine minimum acceptable levels for each quality. Bachman and Palmer (1996) suggest that a list of questions be provided to guide test developers to evaluate test usefulness throughout the process so that feedback can be provided. In addition, test developers should consider ways of providing empirical evidence of test usefulness (see Bachman and Palmer, 1996, Chapter 7, for a detailed list of questions to elicit information on the qualities of test usefulness).
Plan for managing resources
Finally, the test design makes explicit the human, material and time resources needed and available to develop the test. In cases of limited resources, priorities should be set in light of the qualities of test usefulness.
Stage 2: Operationalization
The operationalization stage of grammar-test development describes how an entire test involving several grammar tasks is assembled, and how the individual tasks are specified, written and scored. The outcome of the operationalization phase is both a blueprint for the entire test, including scoring materials, and a draft version of the actual test. According to Bachman and Palmer (1996), the blueprint contains two parts: a description of the overall structure of the test and a set of test-task specifications for each task. (The summary of this chapter is not yet complete.)
CHAPTER EIGHT: Learning-oriented assessment of grammatical ability
What is learning-oriented assessment of grammar?
Learning-oriented assessment of grammar rests on the belief, shared among educational assessment experts, that student learning would improve if assessment, curriculum and instruction were more closely connected.
In reaction to conventional tests, many experts have been working on new assessment techniques that better elicit student outcomes and connect more closely to classroom goals, curricula and instruction. These include alternative, authentic and performance assessment, which seem alike but differ slightly. According to Purpura (2004), alternative assessment encourages assessments in which students are asked to perform or produce meaningful tasks that require both higher-level thinking and real-world implications. Authentic assessment requires knowledge and skills that can be observed in real-life or authentic tasks; to perform these tasks students need complex and extended production, and self-assessment is an important component. Performance assessment refers to the evaluation of outcomes derived from observing more complete tasks that involve real-life situations. Self-assessment is required, with the scoring made explicit in a rubric.
The objective of learning-oriented assessment of grammar is to provide information about the grammar students know, understand or are able to use in different contexts, and about the implications this information might have for grammar processing; moreover, it can give teachers information about how students feel about learning grammar and about themselves as learners. In terms of method, learning-oriented assessment of grammar holds that assessment must be open to all task types, including selected-response, limited-production and complex-production tasks that may not involve real-life implications. Finally, learning-oriented assessment is designed to be an integral part of instruction. It can occur in formal or informal situations, at any stage of the learning process. The data can be collected at one point in time or over a period of time.
Implementing learning-oriented assessment of grammar: considerations from grammar-testing theory
Implementing learning-oriented assessment of grammar has implications for both test design and operationalization; test developers also need to plan for and specify how the assessment will be used to promote further learning.
Implications for test design:
First consideration: in the design stage of test construction, classroom teachers need to specify whom the assessment is for, why the assessment information is needed and what kind of information is needed (essential information for specifying the assessment purpose).
Second consideration: construct definition. Learning-oriented assessment is designed to support inferences about performance and the feedback that can result from observing it.
Third consideration: the need to measure not only the students’ explicit knowledge of grammar but also their implicit, or internalized, knowledge of it.
Implications for operationalization:
The learning mandate will affect the specification of test tasks so that some task characteristics can be better aligned with instructional goals. Learning-oriented assessment of grammar promotes the collection of data on students’ ability through a range of classroom methods, and also collects information about attitudes and feelings toward learning grammar. These data can be collected at one point in time or accumulated over a period of time.
In classroom assessment design, the scoring process results in a written or oral evaluation of candidate responses. At the same time, this provides learners with summative or formative evaluation, for example in the form of feedback. The scoring process therefore allows test-takers to discover for themselves positive and negative evidence of their grammatical ability. The information resulting from achievement tests can provide much more meaningful and constructive guidance on what to notice and how to improve. Feedback and scoring methods can involve students; this can develop their capacity for self-assessment and their responsibility for their own learning.
Planning for further learning: the test blueprint should include explicit information on how the assessment plans to satisfy the learning mandate. Teachers have many options for presenting assessment results to students. They could present students with feedback, a score for each test component, scoring rubrics, a narrative summary of the teacher’s observations, etc.
Considerations from L2 learning theory
In implementing learning-oriented assessment of grammar, teachers need to consider how assessment relates to and can help promote grammar acquisition. This will affect not only what is assessed and how, but also when in the lesson grammar knowledge is best assessed, and what the results mean for how learners can improve.
SLA processes, briefly revisited:
Research in SLA suggests that learning an L2 involves three simultaneous processes:
- Input processing: relates to how the learner understands the meaning of a new grammatical feature or how form-meaning connections are made.
- System change: refers to how learners accommodate new grammatical forms into their interlanguage and how this change helps restructure their interlanguage.
- Output processing: relates to how learners access or make use of implicit grammatical knowledge to produce utterances spontaneously in real time.
Assessing for intake:
This process, the conversion of input into intake, is described as the first critical stage of acquisition. In a communicative language classroom, students are encouraged to engage in tasks in which they must use language meaningfully, with comprehensible input as an essential component of instruction. Assessing for intake requires that learners understand the target forms but not produce them themselves. This can be achieved by selected-response and limited-production tasks in which learners need to make form-meaning connections.
Assessing to push restructuring:
Once input has been converted into intake, the new grammatical feature is ready to be accommodated into the learners’ developing linguistic system. To initiate this process, teachers provide learners with tasks that enable them to use the new grammatical forms in decreasingly controlled situations. By attending to grammatical input and by getting feedback, learners are able to accommodate the differences between their interlanguage and the target language.
Assessing for output processing:
Even though learners have shown explicit knowledge of the form and meaning of new grammatical points, it does not mean they can access this knowledge automatically in spontaneous communication. Learners need to be able to produce unplanned, meaningful output in real time, showing that the grammatical knowledge is already an unconscious part of their developing system of language knowledge.
CHAPTER NINE 
1. Challenge 1
a. Defining grammatical ability 
When grammatical form and meaning are assessed in communicative language testing, it is important to give teachers and learners a more complete assessment that takes the test-takers’ grammatical ability into account, rather than only providing information about form or meaning.
b. Theoretical challenges about the definition of grammatical knowledge
This challenge concerns language educators, who need to make clear distinctions between the form and meaning components of grammatical knowledge, in light of the test purpose, in order to build these distinctions into the construct definition.

2. Challenge 2
a. Scoring grammatical ability
This challenge relates to scoring form and meaning in grammar assessments, and to how language teachers need to adapt their scoring procedures to reflect these two dimensions of grammatical knowledge. It requires the use of measurement models that can accommodate both dichotomous and partial-credit data when analysing test scores.
In scoring extended-production tasks, the descriptors in rubrics must be adapted so that performance in form and in meaning is graded more distinctly.
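The difference between dichotomous and partial-credit data can be sketched as follows. This is a minimal illustration, assuming one point for form and one for meaning under partial credit; the function names and the sample item are hypothetical and are not taken from Purpura (2004).

```python
def score_dichotomous(response, key):
    """Right/wrong scoring: 1 only if the response matches the key, else 0."""
    return 1 if response.strip().lower() == key.strip().lower() else 0

def score_partial_credit(form_correct, meaning_correct):
    """Partial-credit scoring: one point each for form and meaning (0, 1 or 2).

    The two judgements are assumed to come from a rater, not from the code.
    """
    return int(bool(form_correct)) + int(bool(meaning_correct))

# Hypothetical item: "Yesterday she ____ (go) to school." Key: "went"
# The response "goed" conveys past meaning but uses an incorrect form.
dichotomous_score = score_dichotomous("goed", "went")
partial_score = score_partial_credit(form_correct=False, meaning_correct=True)
print(dichotomous_score, partial_score)
```

Under right/wrong scoring the response earns nothing, while partial credit records that the learner got the meaning across, which is the kind of information the two dimensions of grammatical knowledge are meant to capture.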
b. Advantages and disadvantages
“The advantages of using complex performance tasks that are highly authentic is the generalizability of the inferences these tasks allow us to make about grammatical ability” (p. 259).
The disadvantages relate to the lack of precision with which teachers are able to infer what students or test-takers know about grammar, given that other constructs, intended or not, may also be measured in such tasks by raters.

3. Challenge 3
a. Assessing meanings
This challenge concerns how meaning in a model of communicative language ability can be defined and assessed.
b. The assessment of meaning in terms of grammatical meaning and pragmatic meaning
- The primary goal in grammar assessment is to determine whether students are able to use forms to get their basic point across accurately and meaningfully. If meaning is construct-relevant, communicative meaningfulness should be scored.
- Pragmatic meaning involves a range of implied meanings that derive from context, including the interpersonal relationship of the interlocutors.
- The distinctions between grammatical meaning and pragmatic meaning become observable when L2 students fail to understand how meanings can be extended or made intentionally ambiguous.
4. Challenge 4
a. Reconsidering grammar-test tasks 
This challenge relates to the design of test tasks that are able to measure grammatical ability and provide authentic and engaging measures of grammatical performance.
To design tasks that are authentic and engaging measures of performance, it is necessary to consider:
- The assessment purpose and the construct that is going to be measured.
- The kinds of grammatical performance required to provide evidence in support of the inferences.
- Once the inferences are specified, test tasks must be designed to support these claims by measuring what students know about grammar or how they are able to use grammatical resources to accomplish a wide range of activities in the target language.
5. Challenge 5
a. Assessing the development of grammatical ability 
“The challenge for language testers is to design, score and interpret grammar assessments with a consideration for developmental proficiency” (Purpura, 2004, p. 273).
Ellis (2001) states that grammar scores should be calculated to provide a measure of grammatical accuracy and the underlying acquisitional development of L2 students.
- With limited- or extended-production tasks, teachers can give learners credit and feedback by judging performance on these tasks by means of analytic rating scales.
- Rating scales need to be based on construct- and task-based methods in which the different levels of grammatical ability can be described completely.

REFERENCES:
Purpura, James E. 2004. Assessing Grammar (pp. 1-251). Cambridge, United Kingdom: Cambridge University Press.
https://www.slideshare.net/mobile/igotamnesia/assessing-grammar-summary-ch-1-8-9

MEETING 14: ASSESSING READING AND ASSESSING WRITING (CHAPTER 8 & 9)

A. ASSESSING READING 
Reading, the most essential skill for success in all educational contexts, remains a skill of paramount importance as we create assessments of general language ability.
Two primary hurdles must be cleared in order to become efficient readers:
a. Learners must master fundamental bottom-up strategies for processing separate letters, words and phrases, as well as top-down, conceptually driven strategies for comprehension.
b. As part of the top-down approach, second language readers must develop appropriate content and format schemata (background information and cultural experience) to carry out those interpretations effectively.
The assessment of reading ability does not end with the measurement of comprehension. Strategic pathways to full understanding are often important factors to include in assessing learners, especially in the case of most classroom assessments, which are formative in nature.
All assessment of reading must be carried out by inference.
GENRES OF READING
1. Academic Reading
General interest articles (in magazines, newspapers)
Technical reports (e.g., lab reports), professional journal articles
Reference material (dictionaries)
Textbooks, theses
Essays, papers
Test directions
Editorials and opinion writing
2. Job-related Reading
Messages (e.g., phone messages)
Letters / emails
Memos (e.g., interoffice)
Reports (e.g., job evaluations, project reports)
Schedules, labels, signs, announcements
Forms, applications, questionnaires
Financial documents (bills, invoices)
Directories (telephone, office)
Manuals, directions
3. Personal Reading
Newspapers and magazines
Letters, emails, greeting cards, invitations
Messages, notes, lists
Schedules (train, bus, plane)
Recipes, menus, maps, calendars
Advertisements (commercials, want ads)
Novels, short stories, jokes, drama, poetry
Financial documents (e.g., checks, tax forms, loan applications)
Forms, questionnaires, medical reports, immigration documents
Comic strips, cartoons
Importance of Genres of Reading
Knowing the genre of a text enables readers to apply certain schemata that will assist them in extracting appropriate meaning.
Efficient readers have to know what their purpose is in reading a text, the strategies for accomplishing that purpose and how to retain the information.
Microskills for Reading 
1. Discriminate among the distinctive graphemes and orthographic patterns of English.
2. Retain chunks of language of different lengths in short-term memory.
3. Process writing at an efficient rate of speed to suit the purpose.
4. Recognize a core of words, and interpret word order patterns and their significance.
5. Recognize grammatical word classes (nouns, verbs), systems (tense, agreement, and pluralization), patterns, rules and elliptical forms.
6. Recognize that a particular meaning may be expressed in different grammatical forms.
7. Recognize cohesive devices in written discourse and their role in signalling the relationship between and among clauses.
Macroskills for Reading
1. Recognize the rhetorical forms of written discourse and their significance for interpretation.
2. Recognize the communicative functions of written texts, according to form and purpose.
3. Infer context that is not explicit by using background knowledge.
4. From described events, ideas, etc., infer links and connections between events, deduce causes and effects and detect such relations as main idea, supporting idea, new information, given information, generalization and exemplification.
5. Distinguish between literal and implied meaning.
6. Detect culturally specific references and interpret them in a context of the appropriate cultural schemata.
7. Develop and use a battery of reading strategies such as scanning, skimming and detecting discourse markers.
Some Principal Strategies for Reading Comprehension
1. Identify your purpose in reading a text.
2. Apply spelling rules and conventions for bottom-up decoding.
3. Use lexical analysis (prefixes, roots, suffixes, etc.) to determine meaning.
4. Guess at meaning (of words, idioms, etc.) when you aren’t certain.
5. Skim the text for the gist and for main ideas.
6. Scan the text for specific information (names, dates, key words).
7. Use silent reading techniques for rapid processing. 
8. Use marginal notes, outlines, charts, or semantic maps for understanding and retaining information.
9. Distinguish between literal and implied meanings.
10. Capitalize on discourse markers to process relationships.
TYPES OF READING
Perceptive 
- Involves attending to the components of larger stretches of discourse: letters, words, punctuation and other graphemic symbols.
- Bottom-up processing is implied.
Selective 
- This is largely an artefact of assessment formats.
- Certain typical tasks are used, such as picture-cued tasks, matching, true/false and multiple choice.
- Stimuli include sentences, brief paragraphs and simple charts and graphs.
- Brief responses are intended and a combination of bottom-up and top-down processing may be used. 
Interactive 
- Includes stretches of language of several paragraphs to one page or more, in which the reader must interact with the text.
- Genres: anecdotes, short narratives and descriptions, excerpts from longer texts, questionnaires, memos, announcements, directions, recipes and the like.
- Focus: to identify relevant features (lexical, symbolic, grammatical and discourse) within texts of moderately short length with the objective of retaining the information that is processed.
Extensive
- It applies to texts of more than a page, up to and including professional articles, essays, technical reports, short stories and books.
- Purpose: to tap into a learner’s global understanding of a text, as opposed to asking test-takers to “zoom in” on small details.
- Top-down processing is assumed for most extensive tasks.
Designing Assessment Tasks: Perceptive Reading
At the beginning level of reading a second language lies a set of tasks that are fundamental and basic: recognition of alphabetic symbols, capitalized and lowercase letters, punctuation, words and grapheme-phoneme correspondences.
LITERACY tasks: the term implies that the learner is in the early stages of becoming “literate”.
READING ALOUD
The test-taker sees separate letters, words and/or short sentences and reads them aloud, one by one, in the presence of an administrator.
Any recognizable oral approximation of the target response is considered correct.

WRITTEN RESPONSE
The same stimuli are presented, and the test-taker’s task is to reproduce the probe in writing.
Evaluation of the test-taker’s response must be handled carefully.

MULTIPLE CHOICE
1. Grapheme recognition task.
2. Minimal pair distinction

PICTURE-CUED ITEMS
Test-takers are shown a picture along with a written text and are given possible tasks to perform.
1. Picture-cued matching word identification
Designing Assessment Tasks: Selective Reading
Selective Reading
It focuses on formal aspects of language (lexical, grammatical and a few discourse features).
It includes what many incorrectly think of as testing “vocabulary and grammar”.

MULTIPLE CHOICE

1. Multiple-choice vocabulary/grammar tasks
2. Contextualized multiple-choice vocabulary/grammar tasks
3. Multiple-choice vocabulary/grammar tasks

MATCHING TASKS
1. Vocabulary matching task
2. Selected response fill-in vocabulary task
3. Matching task
ADVANTAGES
Matching tasks offer an alternative to traditional multiple-choice or fill-in-the-blank formats and are easier to construct than multiple-choice items.
DISADVANTAGES
They become more of a puzzle-solving process than a genuine test of comprehension as test-takers struggle with the search for a match.
 
EDITING TASKS
Editing for grammatical or rhetorical errors is a widely used test method for assessing linguistic competence in reading.
It does not only focus on grammar but also introduces a simulation of the authentic task of editing or discerning errors in written passages.

PICTURE-CUED TASKS
Diagram-labeling task 

GAP-FILLING TASKS
The response is to write a word or phrase.
One approach is to create sentence-completion items in which test-takers read part of a sentence and then complete it by writing a phrase.
- Sentence Completion Task
Gap Filling Task
DISADVANTAGES
Its validity as an assessment of reading ability is questionable. The task requires both reading and writing performance, which gives it low validity in isolating reading as the sole criterion.
Scoring the variety of creative responses that are likely to appear is another drawback. A number of judgment calls are needed on what constitutes a correct response.


Designing Assessment Tasks: Interactive Reading
Tasks at this level have a combination of form-focused and meaning-focused objectives but with more emphasis on meaning.
It implies a little more focus on top-down processing than on bottom-up.
Texts are a little longer, from a paragraph to as much as a page or so in the case of ordinary prose. Charts, graphs and other graphics are somewhat complex in their format.
- CLOZE TASKS
Cloze tasks draw on the ability to fill gaps in an incomplete image (visual, auditory or cognitive) and to supply (from background schemata) omitted details.
Cloze tests are usually a minimum of two paragraphs in length in order to account for discourse expectancies.
Typically, every seventh word (plus or minus two) is deleted (known as fixed-ratio deletion), but many cloze test designers instead use a rational deletion procedure of choosing deletions according to the grammatical or discourse function of the words.
Two approaches to the scoring of cloze tests:
Exact word method: gives credit to test-takers only if they insert the exact word that was originally deleted.
Appropriate word method: gives credit to test-takers for supplying any word that is grammatically correct and that makes good sense in the context.
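The two scoring approaches can be compared in a small sketch. It assumes that, for the appropriate word method, a rater has already listed the acceptable alternatives for each gap; the function name, its parameters and the sample data are hypothetical.

```python
def score_cloze(responses, deleted_words, acceptable_alternatives=None,
                method="exact"):
    """Count correct gaps under the exact or the appropriate word method.

    responses: the words the test-taker wrote, one per gap
    deleted_words: the words originally deleted, one per gap
    acceptable_alternatives: per-gap collections of other words a rater
        judged acceptable (only used when method is "appropriate")
    """
    if method not in ("exact", "appropriate"):
        raise ValueError("method must be 'exact' or 'appropriate'")
    if len(responses) != len(deleted_words):
        raise ValueError("one response is needed for every gap")
    score = 0
    for gap, (response, original) in enumerate(zip(responses, deleted_words)):
        accepted = {original.lower()}
        if method == "appropriate" and acceptable_alternatives is not None:
            accepted |= {word.lower() for word in acceptable_alternatives[gap]}
        if response.strip().lower() in accepted:
            score += 1
    return score

# Hypothetical two-gap example
deleted = ["big", "quickly"]
answers = ["large", "quickly"]
alternatives = [{"large", "huge"}, {"fast"}]

exact_score = score_cloze(answers, deleted, method="exact")
appropriate_score = score_cloze(answers, deleted, alternatives,
                                method="appropriate")
print(exact_score, appropriate_score)
```

The same set of answers earns a lower score under the exact word method, which is why the choice of method should be fixed before the test is scored.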
1. Cloze procedure, fixed-ratio deletion (every seventh word)
2. Cloze procedure, rational deletion (prepositions and conjunctions)
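The two deletion procedures can be sketched in a few lines. This is a simplified illustration, not a procedure from the source: words are split on whitespace, so punctuation attached to a deleted word disappears with it, and the function names and the blank marker are hypothetical.

```python
BLANK = "_____"

def make_fixed_ratio_cloze(text, n=7):
    """Delete every n-th word; return the gapped text and the deleted words."""
    if n < 2:
        raise ValueError("n must be at least 2")
    gapped, deleted = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            deleted.append(word)
            gapped.append(BLANK)
        else:
            gapped.append(word)
    return " ".join(gapped), deleted

def make_rational_cloze(text, targets):
    """Delete only the words listed in targets (e.g. prepositions)."""
    wanted = {target.lower() for target in targets}
    gapped, deleted = [], []
    for word in text.split():
        # Ignore surrounding punctuation when checking against the targets
        if word.strip(".,;:!?").lower() in wanted:
            deleted.append(word)
            gapped.append(BLANK)
        else:
            gapped.append(word)
    return " ".join(gapped), deleted

sample = "The cat sat on the mat and slept"
gapped_text, removed = make_rational_cloze(sample, {"on", "and"})
print(gapped_text)
```

In practice a test designer would usually leave the first sentence or two intact before starting the deletions; that refinement is omitted here to keep the sketch short.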
Variations on Standard Cloze Testing
C-test: the second half (according to the number of letters) of every other word is obliterated and the test-taker must restore each word.
Cloze-elide procedure: words that do not belong are inserted into a text. The test-taker’s task is to detect and cross out the “intrusive” words.
Cloze-elide procedure is actually a test of reading speed and not of proofreading skill.
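The C-test deletion rule can be sketched as below. Several details are assumptions made for the illustration only: mutilation starts at the second word, odd-length words keep the larger half, and one-letter words or words with attached punctuation are left intact. The function name is hypothetical.

```python
def make_c_test(text, start=1):
    """Blank out the second half of every other word, beginning at index start."""
    result = []
    for index, word in enumerate(text.split()):
        is_target = index >= start and (index - start) % 2 == 0
        if is_target and word.isalpha() and len(word) > 1:
            keep = (len(word) + 1) // 2  # odd-length words keep the extra letter
            result.append(word[:keep] + "_" * (len(word) - keep))
        else:
            result.append(word)
    return " ".join(result)

c_test_line = make_c_test("The weather today looks really nice")
print(c_test_line)
```

Each underscore stands for one missing letter, so the test-taker can see how long the restored word should be.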
DISADVANTAGES
Neither the words to insert nor the frequency of insertion appears to have any rationale. 
Fast and efficient readers are not adept at detecting the intrusive words. Good readers naturally weed out such potential interruptions.

- IMPROMPTU READING PLUS COMPREHENSION QUESTIONS
This is the traditional “Read a passage and answer some questions” technique, which is the oldest and the most common. Ex: Reading comprehension passage (Phillips, 2001, pp. 421-422) and computer-based TOEFL reading comprehension item.

- SHORT-ANSWER TASKS
A reading passage is presented and the test-taker reads questions that must be answered in a sentence or two.
1. Open-ended reading comprehension questions
- EDITING (LONGER TEXTS)
ADVANTAGES
Authenticity is increased.
The task simulates proofreading one’s own essay, where it is imperative to find and correct errors. 
If the test is connected to a specific curriculum, the test designer can draw up specifications for a number of grammatical and rhetorical categories that match the content of the courses.


SCANNING
It is a strategy used by all readers to find relevant information in a text.
Test-takers are presented with a text (prose or something in a chart or graph format) and are required to identify relevant bits of information rapidly.
- Possible stimuli include:
- A one to two page news article
- An essay
- A chapter in a textbook
- A technical report
- A table or chart depicting some research findings
- An application form
The test-taker must locate:
- A date, name or place in an article;
- The setting for a narrative story;
- The principal divisions of a chapter;
- The principal research finding in a technical report;
- A result reported in a specified cell in a table;
- The cost of an item on a menu; and
- Specified data needed to fill out an application.

ORDERING TASKS
- Sometimes called the “strip story” technique.
- Variations on this can serve as an assessment of overall global understanding of a story and of the cohesive devices that signal the order of events or ideas.
Sentence-ordering
INFORMATION TRANSFER: READING CHARTS, MAPS, GRAPHS, DIAGRAMS
- It requires not only an understanding of the graphic and verbal conventions of the medium but also a linguistic ability to interpret the information to someone else.
- It is often accompanied by oral or written discourse in order to convey, clarify, question, argue and debate, among other linguistic functions.
- To comprehend information in this medium, learners must be able to:
comprehend specific conventions of the various types of graphics;
comprehend labels, headings, numbers and symbols;
comprehend the possible relationship among elements of the graphic; and 
make inferences that are not presented overtly.
The act of comprehending graphics includes the linguistic performance of oral or written interpretations, comments, questions, etc. This implies a process of information transfer from one skill to another, in this case from reading verbal/nonverbal information to speaking/writing.
Designing Assessment Tasks: Extensive Reading
- It involves somewhat longer texts. Journal articles, technical reports, longer essays, short stories and books fall into this category.
- Reading of this type of discourse almost always involves a focus on meaning using mostly top-down processing, with only occasional use of targeted bottom-up strategy.
Tasks that can be applied in extensive reading:
- Impromptu reading plus comprehension questions
- Short answer tasks 
- Editing 
- Scanning 
- Ordering 
- Information transfer and 
- Interpretation (discussed under graphics)
SKIMMING TASKS
It is the process of rapid coverage of reading matter to determine its gist or main idea.
It is a prediction strategy used to give a reader a sense of the topic and purpose of a text, the organization of the text, the perspective or point of view of the writer, its ease or difficulty and its usefulness to the reader.
SUMMARIZING AND RESPONDING
SUMMARIZING
It requires a synopsis or overview of the text.
RESPONDING
It asks the reader to provide his/her own opinion on the text as a whole or on some statement or issue within it.
Scoring responses is also difficult because of their subjectivity.
Holistic Scoring scale for summarizing and responding
3            Demonstrates clear, unambiguous comprehension of the main and supporting ideas.
2            Demonstrates comprehension of the main idea but lacks comprehension of some supporting ideas.
1            Demonstrates only a partial comprehension of the main and supporting ideas.
0            Demonstrates no comprehension of the main and supporting ideas.

NOTE-TAKING and OUTLINING
- They fall into the category of informal assessment.
- Their utility is in the strategic training that learners gain in retaining information through marginal notes that highlight key information or organizational outlines that put supporting ideas into a visually manageable framework.

B. ASSESSING WRITING 
Genres of Writing
- 1. Academic writing
- Papers and general reports / essays, compositions / academically focused journals / theses / dissertations
- 2. Job-related writing
- Messages / letters, emails / memos / reports / labels / signs / advertisements / announcements
- 3. Personal writing
- Greeting cards / invitations / notes / tax forms / diaries / fiction / personal journals
Types of Writing Performance
- 1. Imitative. At this level, learners are trying to master the mechanics of writing. Form (letters, words, punctuation, and brief sentences) is the primary focus, while context and meaning are of secondary concern.
- 2. Intensive. It includes skills in producing appropriate vocabulary, collocations, idioms, and correct grammatical features. Most assessment tasks are concerned with a focus on form.
- 3. Responsive. At this level, form-focused attention is mostly at the discourse level, with a strong emphasis on context and meaning. Assessment tasks require learners to connect a sequence of two or three paragraphs.
- 4. Extensive. Extensive writing uses all the processes and strategies of writing to produce an essay, a term paper, a project report, or even a thesis.
Imitative Writing 
- 1. Copying. There is nothing innovative or modern about directing a test-taker to copy letters or words.
- Ex: (Copy the words)
- Bit bet but gin din pin
- ____ ___ ____ ___ ____ ____
- 2. Listening cloze selection tasks. Test-takers listen to a passage and then write the missing words. (p. 222)
- 3. Picture-cued tasks. Familiar pictures are displayed, and test-takers are told to write the word that the picture represents.
- 4. Form completion tasks. A simple form (registration, application, etc.) asks for name, address, phone number, and other data.
- 5. Converting numbers and abbreviations to words.
- 9:00 ______      5:45______
- Tues.______     5/3  ______
- 726 S. Main St. _________
Spelling Tasks and Detecting Phoneme-Grapheme Correspondences
- Spelling tests
- Picture-cued tasks
- Multiple-choice techniques
- Example:
- He washed his hands with _____
- A. soap B. sope C. sop D. soup
- Matching phonetic symbols
- d/e/ ______    I /ai/ /k/ _____
Intensive Writing 
- The same as controlled writing or guided writing. At this level, students produce language to display their competence in grammar, vocabulary or sentence formation, and not necessarily to convey meaning for an authentic purpose.
- Dictation and Dicto-Comp
- Dictation is the rendition in writing of what one hears aurally.
- Dicto-Comp: A paragraph is read at normal speed, usually two or three times; then the teacher asks students to rewrite the paragraph to the best of their recollection.
- Variation: After reading the passage, the teacher distributes a handout with key words from the paragraph as cues for students.
Grammatical Transformation Tasks
- The tasks are:
- Change the tenses in a paragraph.
- Change full forms of verbs to reduced forms.
- Change statements to yes/no or WH-questions.
- Change questions into statements.
- Combine two sentences into one using a relative pronoun.
- Change from active to passive voice.
Picture-Cued Tasks
- Short sentences. A drawing of some simple action is shown; the test-taker writes a brief sentence. (p. 227)
- Picture description. Test-takers describe a picture using prepositions such as on, over, under, next to, and around (see the picture on p. 192).
- Picture sequence description. A sequence of three to six pictures depicting a story line can provide a suitable stimulus for a written task. (p. 228)
- Vocabulary Assessment Tasks
- The major techniques used to assess vocabulary are (a) defining and (b) using a word in a sentence. 
- Ordering tasks
- Reordering words in a sentence
- 1. Cold /winter /is /weather /the /in /the
- 2. Studying /what /you /are
- 3. Next /clock /the /is /picture /to
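A reordering item like those above could be produced and checked with a short script. The following is only a minimal sketch in Python: the function names `make_ordering_item` and `is_correct` are hypothetical, and the exact-match check is an assumption that would reject any alternative word order a teacher might also accept.

```python
import random


def make_ordering_item(sentence, seed=None):
    """Return the words of `sentence` in scrambled order, separated by ' / '."""
    words = sentence.rstrip(".?!").split()
    rng = random.Random(seed)
    scrambled = words[:]
    # Reshuffle until the order differs from the original; this is only
    # possible when the sentence contains at least two distinct words.
    if len(set(words)) > 1:
        while scrambled == words:
            rng.shuffle(scrambled)
    return " / ".join(scrambled)


def is_correct(response, key):
    """Compare a response with the key, ignoring case, spacing, and end punctuation."""
    def normalize(text):
        return text.lower().rstrip(".?!").split()
    return normalize(response) == normalize(key)
```

For example, `make_ordering_item("What are you studying?")` returns the four words in a new order, and `is_correct("what are you studying", "What are you studying?")` accepts the unscrambled answer.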
Short-Answer and Sentence Completion Tasks
- Ex: 
- 1. A: Who’s that? B: ______Gina.
- A: Where’s she from? B: _______Italy.
- 2. Write three sentences describing your preferences: #6a: a big, expensive car or a small, cheap car; #6b: a house in the country or an apartment in the city; #6c: money or good health.
- 6a._______ 6b._______ 6c. ______
Issues in Assessing Responsive and Extensive Writing
- The genres of text here are:
- Short reports / responses to the reading of an article or story / summaries of articles or stories / brief narratives or descriptions / interpretations of graphs, tables, and charts.
- Writers become involved in composing, or real writing, as opposed to display writing.
- 1. Authenticity. Assessment is typically formative, not summative, and positive washback is more important than practicality and reliability.
- 2. Scoring. Not only the form but also the function of the text are important in evaluation.
- 3. Time. Responsive writing, along with extensive writing, relies on the essential drafting process for its ultimate success.
Designing Assessment Tasks: Responsive and Extensive Writing
- Paraphrasing. This means saying something in one’s own words. It helps writers avoid plagiarism and offers some variety in expression. Scoring is judged by how well the test-taker conveys the same or a similar message, with discourse, grammar, and vocabulary as secondary considerations.
- Guided Question and Answer
- The test administrator poses a series of questions which serve as an outline of the emergent written text.
Guided Written Stimuli
- 1. Where did this story take place? (setting)
- 2. Who were the people in the story?
- 3. What happened first? And then? And then?
- 4. Why did ________do________(reason)?
- 5. What did _____think about_____? (opinion)
- 6. What happened at the end? (climax)
- 7. What is the moral of the story? (evaluation)
Paragraph Construction Tasks
- Assessment of paragraph development takes the following forms.
- 1. Topic sentence writing. Assessment looks at the topic sentence: its presence or absence and its effectiveness in stating the topic.
- 2. Topic development within a paragraph. Four criteria are used to assess quality:
- The clarity of expression of ideas / the logic of the sequence and connections / the cohesiveness or unity / the overall effectiveness or impact.
- 3. Development of main and supporting ideas across paragraphs. The elements in evaluating a multi-paragraph essay:
- Addressing the topic, main idea, or principal purpose.
- Organizing and developing supporting ideas.
- Using appropriate details to undergird supporting ideas.
- Showing facility and fluency in the use of language.
- Demonstrating syntactic variety.
Strategic Options
- 1. Attending to task. A set of directives is stated or implied by the teacher or the conventions of the genre. Four types: compare/contrast, problem/solution, pros/cons, and cause/effect.
- 2. Attending to genre. Reports, Summaries of reading/lectures/videos, Responses to reading/lectures/videos, Narration, description, persuasion/argument, and exposition, Interpreting statistical, graphic data, Library research paper.
Test of Written English (TWE)
- The TWE is a standardized test of writing ability that has gained a reputation as a well-respected measure of written English.
- The TWE is a timed impromptu test in which test-takers are under a 30-minute time limit and are not able to prepare ahead of time.
Sample TWE Topic
- Some people say that the best preparation for life is learning to work with others and be cooperative. Others take the opposite view and say that learning to be competitive is the best preparation. Discuss these positions, using concrete examples of both. Tell which one you agree with and explain why.
Six Steps to Maximize success on the TWE 
- 1. Carefully identify the topic.
- 2. Plan your supporting ideas.
- 3. In the introductory paragraph, restate the topic and state the organizational plan.
- 4. Write effective supporting paragraphs. 
- 5. Restate your position and summarize in the concluding paragraph.
- 6. Edit sentence structure and rhetorical expression. (Scoring Guide p. 239)

Scoring Methods for Responsive and Extensive Writing 

- There are three major approaches to scoring writing performance: holistic, primary trait, and analytic.
- Holistic: A single score is assigned to an essay.
- Primary trait: The achievement of the primary purpose, or trait, of an essay is the only factor rated.
- Analytic: The written text is broken down into a number of subcategories (e.g., organization, grammar), and each subcategory gets a separate rating.
Holistic Scoring
- Advantages:
- Fast evaluation / high inter-rater reliability / easily interpreted by laypersons / emphasis on the writer’s strengths / applicability to writing across many different disciplines
- Disadvantages:
- No diagnostic information / does not apply equally well to all genres / raters need training / only one score is reported
Primary Trait Scoring 
- If the purpose or function of an essay is to persuade the reader to do something, the score for the writing would rise or fall on the accomplishment of that function.
- Organization, supporting details, fluency, syntactic variety, and other features will also be evaluated.
- Advantage : focus on function.
Analytic Scoring
- Classroom evaluation of learning is best served through analytic scoring.
- Brown and Bailey (1984) designed an analytic scoring scale that specified five major categories and five different levels in each category, ranging from “unacceptable” to “excellent”.
- Five categories: organization, logical development of ideas, grammar, punctuation /spelling/ mechanics, and quality of expression. (pp.244-245)
Analytic Scoring (continued): category weights
- Content 30
- Organization 20
- Vocabulary 20
- Syntax 25
- Mechanics 5
- Total 100
- Analytic scoring offers more washback and helps to call the writer’s attention to problem areas, but it requires more time for teachers to attend to details within each of the categories.
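The weighted categories listed above (Content 30, Organization 20, Vocabulary 20, Syntax 25, Mechanics 5) add up to a 100-point total. A minimal sketch in Python of how a teacher might total such ratings is shown here; the function name `analytic_score` and the idea of reporting the proportionally weakest category are assumptions added for illustration, not part of the scale itself.

```python
# Maximum points per category, taken from the weights listed in the notes.
MAX_POINTS = {
    "content": 30,
    "organization": 20,
    "vocabulary": 20,
    "syntax": 25,
    "mechanics": 5,
}


def analytic_score(points):
    """Return (total, weakest_category) for one essay.

    `points` maps each category name to the points the rater awarded.
    The weakest category is the one with the lowest share of its maximum.
    """
    for category, maximum in MAX_POINTS.items():
        if category not in points:
            raise ValueError(f"missing rating for {category}")
        if not 0 <= points[category] <= maximum:
            raise ValueError(f"{category} must be between 0 and {maximum}")
    total = sum(points[category] for category in MAX_POINTS)
    weakest = min(MAX_POINTS, key=lambda c: points[c] / MAX_POINTS[c])
    return total, weakest
```

Reporting the weakest category alongside the total is one way to use the diagnostic detail that analytic scoring provides.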
REFERENCE:
Brown, H. D. 2004. Language Assessment: Principles and Classroom Practices. Pearson.
- Assessing Reading (Chapter 8, pp. 185-217)
- Assessing Writing (Chapter 9, pp. 218-250)
https://www.slideshare.net/mobile/venj88/assessing-reading-hd-brown-handout. Accessed on 8 May  2020
http://www.slideshare.net/mobile/kheangsokheng52/chapter-9-assessing-writing. Accessed on 10 May 2020

MEETING 13: ASSESSING LISTENING AND ASSESSING SPEAKING (CHAPTERS 6 & 7)
A. ASSESSING LISTENING
DEFINITION
According to Fang (2008), the nature of listening means that the learner should be encouraged to engage in an active process of listening for meaning, using not only linguistic cues but also nonlinguistic knowledge.
In the modern view of listening, O’Malley and Chamot (1989) define listening comprehension as an active and conscious process in which the listener constructs meaning by using cues from contextual information and existing knowledge while relying upon multiple strategic resources to fulfill the task requirement.
LANGUAGE SKILLS: 
1. Listening 
2. Speaking
3. Reading
4. Writing
Listening is a receptive skill. The importance of listening:
1. It is often implied as a component of speaking.
2. It provides the input needed for successful language acquisition.
3. It is applicable in many contexts: the workplace, education, and the home.
Assessment of listening must be made by inference, because we can observe neither the actual act of listening nor its product.
WHAT MAKES LISTENING SO DIFFICULT?
Clustering
Redundancy
Reduced forms 
Performance variables
Colloquial language 
Rate of delivery
Stress, rhythm, and intonation
Interaction
BASIC TYPES OF LISTENING
Heaton (1988) argued that listening skills can be developed through listening comprehension tests.
He therefore listed two categories of auditory tests:
1. Tests of phoneme discrimination and of sensitivity to stress and intonation.
2. Tests of listening comprehension.
Brown (2004) stated that designing an effective test or appropriate assessment must begin with the specification of objectives or criteria, which can be classified into several types of listening performance.
THE PERFORMANCE OF LISTENING

MACRO
Extensive listening: developing the gist, a global/comprehensive understanding.
Selective listening: determining meaning of auditory input.
Responsive listening: understanding pragmatic context.
MICRO
Intensive listening: comprehending language structure elements.

DESIGNING ASSESSMENT TASKS: INTENSIVE LISTENING
Phonological and morphological elements recognition
Phonemic pair, consonant
Test-takers hear: He’s from California.
Test-takers read: (a) He’s from California.
                                (b) She’s from California.

Phonemic pair, vowels
Test-takers hear: Is he living?
Test-takers read: (a) Is he leaving?
                                (b) Is he living?

Morphological pair, -ed ending
Test-takers hear: I missed you very much.
Test-takers read: (a) I missed you very much.
                                (b) I miss you very much.

Stress pattern in can’t 
Test-takers hear: My girlfriend can’t go to the party.
Test-takers read: (a) My girlfriend can’t go to the party.
                                (b) My girlfriend can go to the party.

One-word stimulus
Test-takers hear: vine
Test-takers read: (a) vine
                                (b) wine

Paraphrase recognition
Sentence paraphrase
Test-takers hear: Hello, my name’s Keiko. I come from Japan.
Test-takers read: (a) Keiko is comfortable in Japan.
                                (b) Keiko wants to come to Japan.
                                (c) Keiko is Japanese.
                                (d) Keiko likes Japan. 

Dialogue paraphrase
Test-takers hear:
Man: Hi, Maria, my name’s George.
Woman: Nice to meet you, George. Are you American?
Man: No, I’m Canadian.
Test-takers read: (a) George lives in the United States.
                                (b) George is American.
                                (c) George comes from Canada.
                                (d) Maria is Canadian.

DESIGNING ASSESSMENT TASKS: RESPONSIVE LISTENING
Appropriate response to question
Test-takers hear: How much time did you take to do your homework?
Test-takers read: (a) In about an hour.
                                (b) About an hour.
                                (c) About $10.
                                (d) Yes, I did.

Open-ended response to a question
Test-takers hear: How much time did you take to do your homework?
Test-takers write/speak:__________________________________.

DESIGNING ASSESSMENT TASKS: SELECTIVE LISTENING
Listening cloze
Test-takers hear:
Ladies and gentlemen, I now have some connecting gate information for those of you making connections to other flights out of San Francisco.

Flight seven-oh-six to Portland will depart from gate seventy-three at nine-thirty p.m.
Flight ten-forty-five to Reno will depart at nine-fifty p.m. from gate seventeen.
Flight four-forty to Sacramento will depart at nine-thirty-five p.m. from gate sixty.
And flight sixteen-oh-three to Sacramento will depart from gate nineteen at ten-fifteen.

Test-takers write the missing words or phrases in the blanks.
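A listening cloze like the one above can be prepared by blanking chosen words in the transcript and keeping them as the answer key. The following is a minimal sketch in Python; the helper names `make_cloze` and `score_cloze` are hypothetical, and exact-word scoring is an assumption (a teacher may prefer to accept close spellings).

```python
def make_cloze(transcript, targets):
    """Replace the first occurrence of each target with a numbered blank.

    Returns the gapped text for the test sheet and the answer key.
    """
    text = transcript
    key = []
    for number, target in enumerate(targets, start=1):
        if target not in text:
            raise ValueError(f"target not found in transcript: {target}")
        text = text.replace(target, f"({number}) ______", 1)
        key.append(target)
    return text, key


def score_cloze(responses, key):
    """Count responses that match the key, ignoring case and outer spaces."""
    return sum(r.strip().lower() == k.lower() for r, k in zip(responses, key))
```

Choosing the targets by hand, as here, lets the test writer blank only the information being assessed, such as flight numbers, gates, and departure times.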

Information transfer
 

Information transfer: single-picture-cued verbal multiple-choice
Test-takers see: a photograph of a woman in a laboratory setting, with no glasses on, squinting through a microscope with her right eye closed.

Test-takers hear: (a) She’s speaking into a microphone.
                                (b) She’s putting on her glasses.
                                (c) She has both eyes open.
                                (d) She’s using a microscope.

Information transfer: chart-filling
Test-takers hear:
Now you will hear information about Lucy’s schedule. The information will be given twice. The first time, just listen carefully. The second time, there will be a pause after each sentence. Fill in Lucy’s blank daily schedule with the correct information. The example has already been filled in.

You will hear: Lucy gets up at eight o’clock every morning except on weekends.
You will fill in the schedule to provide the information.
Now listen to the information about Lucy’s schedule. Remember, you will first hear all the sentences; then you will hear each sentence separately with time to fill in your chart.

Lucy gets up at eight o’clock. She has History on Tuesday and Thursday at two o’clock.
She takes Chemistry on Monday from two o’clock to six o’clock. She plays tennis on weekends at four o’clock. She eats lunch at twelve o’clock every day except Saturday and Sunday. 

Now listen a second time. There will be a pause after each sentence to give you time to fill in the chart.

Test-takers see the following weekly calendar grid:
Time | Monday | Tuesday | Wednesday | Thursday | Friday | Weekends
8:00 | Get up | Get up | Get up | Get up | Get up |
10:00 | | | | | |
12:00 | | | | | |
2:00 | | | | | |
4:00 | | | | | |
6:00 | | | | | |


Sentence repetition
Sentence repetition is far from a flawless listening assessment task. It may test only the recognition of sounds, and it can easily be contaminated by a lack of short-term memory ability, thus invalidating it as an assessment of comprehension alone. In addition, the teacher may never be able to distinguish a listening comprehension error from an oral production error.

DESIGNING ASSESSMENT TASKS: EXTENSIVE LISTENING 
Dictation
First reading (natural speed, no pauses, test-takers listen for gist):

The state of California has many geographical areas. On the western side is the Pacific Ocean with its beaches and sea life. The central part of the state is a large fertile valley. The southeast has a hot desert, and north and west have beautiful mountains and forests. Southern California is a large urban area populated by millions of people.

Second reading (slowed speed, pause at each // break, test-takers write):

The state of California has many geographical areas. // On the western side // is the Pacific Ocean // with its beaches and sea life. // The central part of the state // is a large fertile valley. // The southeast has a hot desert, // and north and west // have beautiful mountains and forests. // Southern California // is a large urban area // populated by millions of people.

Third reading (natural speed, test-takers check their work).

Communication stimulus-response tasks
Dialogue and multiple-choice comprehension items
Test-takers hear:

Directions: Now you will hear a conversation between Lynn and her doctor. You will hear the conversation two times. After you hear the conversation the second time, choose the correct answers for questions 11-15 below. Mark your answers on the answer sheet provided.

Doctor: Good morning, Lynn. What’s the problem?
Lynn: Well, you see, I have a terrible headache, my nose is running, and I am really dizzy.
Doctor: Okay. Anything else?
Lynn: I’ve been coughing. I think I have a fever, and my stomach aches.
Doctor: I see. When did this start?
Lynn: Well, let’s see, I went to the lake last weekend, and after I returned home I started sneezing.
Doctor: Hmm. You must have the flu. You should get lots of rest, drink hot beverages, and stay warm. Do you follow me?
Lynn: Well, uh, yeah, but . . . shouldn’t I take some medicine?
Doctor: Sleep and rest are as good as medicine when you have the flu.
Lynn: Okay, thanks, Dr. Brown.

Test-takers read:
11. What is Lynn’s problem?
A. She feels horrible.
B. She ran too fast at the lake.
C. She’s been drinking too many hot beverages.
12. When did Lynn’s problem start?
A. When she saw her doctor.
B. Before she went to the lake.
C. After she came home from the lake.
13. The doctor said that Lynn _____.
A. Flew to the lake last weekend.
B. Must not get the flu.
C. Probably has the flu.
14. The doctor told Lynn ____.
A. To rest.
B. To follow him.
C. To take some medicine.
15. According to Dr. Brown, sleep and rest are _____ medicine when you have the flu.
A. More effective than.
B. As effective as.
C. Less effective than.

Dialogue and authentic question on details
Test-takers hear:
You will hear a conversation between a detective and a man. The tape will play the conversation twice. After you hear the conversation the second time, choose the correct answers on your test sheet.

Detective: Where were you last night at eleven p.m., the time of the murder?
Man: Uh, let’s see, well, I was just starting to see a movie.
Detective: Did you go alone?
Man: No, uh, well, I was with my friend, uh, Bill, yeah, I was with Bill.
Detective: What did you do after that?
Man: We went out to dinner, then I dropped her off at her place.
Detective: Then you went home?
Man: Yeah.
Detective: When did you get home?
Man: A little before midnight.

Test-takers read:
7. Where was the man at 11:00 p.m.?
A. In a restaurant.
B. In a theatre.
C. At home.
8. Was he with someone?
A. He was alone.
B. He was with his wife.
C. He was with a friend.
9. Then what did he do?
A. He ate out.
B. He made dinner.
C. He went home.
10. When did he get home?
A. About 11:00.
B. Almost 12:00.
C. Right after the movie.
11. The man is probably lying because (name two clues):
1. ________________________________________
2. ________________________________________

Authentic listening tasks
a. Note-taking 
This is commonly done by non-native English-speaking students while listening to classroom lectures by professors. Their notes are then evaluated, although the scoring lacks some reliability.
b. Editing
Test-takers read: the written stimulus material (a news report, an email from a friend, notes from a lecture, or an editorial in a newspaper).
Test-takers hear: a spoken version of the stimulus that deviates, in a finite number of facts or opinions, from the original written form.
Test-takers mark: the written stimulus by circling any words, phrases, facts, or opinions that show a discrepancy between the two versions.
c. Interpretive tasks
These tasks extend the stimulus material to a longer stretch of discourse and force the test-takers to infer a response by answering a few open-ended questions. Potential stimuli include song lyrics, recited poetry, radio/TV news reports, and oral accounts of experiences.
d. Retelling 
The test-takers listen to a story or news event and simply retell or summarize it, either orally or in writing. To show full comprehension, the test-takers are required to identify the main idea, purpose, and supporting details.

B. ASSESSING SPEAKING 
Basic Types of Speaking 
- Imitative. It is simply the ability to parrot back a word, a phrase, or a sentence.
- Intensive. It is the production of short stretches of oral language. Examples include directed response tasks, reading aloud, sentence and dialogue completion, and limited picture-cued tasks.
- Responsive. The tasks include interaction and test comprehension but at the limited level of short conversations, standard greetings, small talk, requests, and comments. 
- Interactive. The length and complexity of the interaction are more in interactive tasks than in responsive ones. The task sometimes includes multiple exchanges and/or multiple participants.
- Extensive. (monologue) The tasks include speeches, oral presentations, and story-telling. Oral interaction from listeners is either highly limited or ruled out altogether.
Assessment Tasks: Imitative Speaking
- Word repetition task 
- Test-takers hear:
- beat/bit bat/vat
- I bought a boat yesterday.
- The glow of the candle is growing.
- Test-takers repeat the stimulus.
Scoring scale for repetition tasks
- 2 acceptable pronunciation.
- 1 comprehensible, partially correct.
- 0 silence, seriously incorrect.
PhonePass Test
- It elicits computer-assisted oral production over a telephone. Test-takers read aloud, repeat sentences, say words, and answer questions.
- Part A: Test-takers read aloud selected sentences. 
- Ex: Traffic is a huge problem in Southern California.
- Part B: Test-takers repeat sentences dictated over the phone.
- Ex: Leave town on the next train. 
- Part C: Test-takers answer questions with a single word or a short phrase.
- Ex: Would you get water from a bottle or a newspaper?
- Part D: Test-takers hear three word groups in random order and link them in a correctly ordered sentence. Ex: was reading/my mother/a magazine.
- Part E: Test-takers have 30 seconds to give their opinion on a topic that is dictated over the phone. Topics center on family, preferences, and choices.
- Scores are calculated by a computerized scoring template and reported back to the test-taker within minutes.
Assessment Tasks: Intensive  Speaking 
- Directed response tasks
- Directed response
- Tell me he went home.
- Tell me that you like rock music .
- Tell me that you aren’t interested in tennis.
- Tell him to come to my office at noon.
- Remind him what time it is.
Test of Spoken English Scoring Scale (Read-Aloud Tasks)
- Pronunciation:
- Points:
- 0.0-0.4 frequent errors and unintelligible.
- 0.5-1.4 occasionally unintelligible.
- 1.5-2.4 some errors but intelligible.
- 2.5-3.0 occasional errors but always intelligible.
- Fluency:
- Points:
- 0.0-0.4 slow, hesitant, and unintelligible.
- 0.5-1.4 non-native pauses and flow that interfere with intelligibility.
- 1.5-2.4 non-native pauses but the flow is intelligible.
- 2.5-3.0 smooth and effortless.
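The point bands above can be read as a simple lookup from a rater's score to a descriptor. The sketch below, in Python, shows one way to do that for the pronunciation scale. It rests on two assumptions that are not stated in the notes: a score of exactly 2.5 is placed in the top band, and a score that falls between two printed bands (for example 0.45) is placed in the lower one. The names `PRONUNCIATION_BANDS` and `describe` are hypothetical.

```python
# (lower bound of band, descriptor), listed from the highest band down.
PRONUNCIATION_BANDS = [
    (2.5, "occasional errors but always intelligible"),
    (1.5, "some errors but intelligible"),
    (0.5, "occasionally unintelligible"),
    (0.0, "frequent errors and unintelligible"),
]


def describe(score, bands):
    """Return the descriptor of the highest band whose lower bound is met."""
    if not 0.0 <= score <= 3.0:
        raise ValueError("score must be between 0.0 and 3.0")
    for lower_bound, descriptor in bands:
        if score >= lower_bound:
            return descriptor
```

The same function would work for the fluency scale if a second list of bands were supplied.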
Variations on Read-Aloud tasks
- Reading a scripted dialogue.
- Reading sentences containing minimal pairs. Ex: Try not to heat / hit the pan too much.
- Reading information from a table or chart.
Read-Aloud Tasks
- Advantages:
- Comparisons between students are quite simple.
- Tests are easy to prepare and to administer.
- Predictable output, practicality, and reliability in scoring.
- Disadvantages:
- It is inauthentic, except in situations such as a parent reading to a child, sharing a story with someone, or giving a scripted oral presentation.
- It is not communicative in a real context.
Sentence/Dialogue Completion Tasks and Oral Questionnaires
- First, test-takers are given time to read through the dialogue to get its gist, then the tape/teacher produces one part orally and the test-taker responds.
- Ex: (p. 150) short  dialogue (p. 151)
- Advantage: more time to anticipate an answer, no potential ambiguity created by aural misunderstanding (oral interview).
Picture-Cued tasks
- A picture-cued stimulus requires a description from the test-taker. It may elicit a word, a phrase, a story, or incident. 
- Scoring scale for intensive tasks:
- 2 comprehensible; acceptable target form 
- 1 comprehensible; partially correct
- 0 silence; or seriously incorrect.
A Scale for evaluating Interviews
- Grammar
- Vocabulary
- Comprehension
- Fluency
- Pronunciation
- Task (the objective of the elicited task)
- Example: (p. 158)
Translation 
- Translation is a communicative device in contexts where English is not a native language.
- English can be called on to be interpreted as a second language.
- Conditions may vary from an instant translation of a native word, phrase, or sentence to the translation of longer texts.
- Advantages: the control of the output and easily specified scoring.
Responsive Speaking 
- Question and answer
- Ex: 1. What is this called in English?
- (to elicit a predetermined correct response)
- 2. What are the steps governments should take, if any, to stem the rate of deforestation in tropical countries? (gives more opportunity to produce meaningful language in response)
Questions Eliciting Open-Ended Responses
- 1. What do you think about the weather today?
- 2. Why did you choose your academic major?
- 3. a. Have you ever been to the U.S. before?
                          b. What other countries have you visited?
                          c. Why did you go there? What did you like best about it?
Giving Instructions and Directions
- Ex: how to operate an appliance, how to put a bookshelf together, or how to create a dish.
- Scoring: based on
1. Comprehensibility
2. Specified grammatical/discourse categories.
Eliciting Instructions or Directions
- Test-takers hear:
- Describe how to make a typical dish
- What’s a good recipe for making___?
- How do you access email on a PC?
- How do I get from___to___in your city?
- Test-takers respond.
Consideration of Paraphrasing
- 1. Elicit short stretches of output
- 2. The criterion being assessed:
a. Is it a listening task more than production?
b. Does it test short-term memory rather than linguistic ability?
c. How does the teacher determine scoring of responses?
Test of Spoken English (TSE)
- TSE is a 20-minute audiotaped test of oral language ability within an academic or professional environment. 
- TSE scores are used by many North American institutions of higher education.
- The tasks are designed to elicit oral production in various discourse categories. (p. 163)
- Ex: sample items in TOEFL (p. 164)
- Scoring: a holistic score ranging from 20 to 60 (performance, function, appropriateness, and coherence)
Interactive Speaking
- Oral interview: a test administrator and a test-taker sit down in a direct face-to-face exchange and proceed through a protocol of questions and directives.
- It varies in length from 5 to 45 minutes, depending on purpose and context. Placement interviews may need only 5 minutes, while an Oral Proficiency Interview (OPI) may require an hour.
A Framework for Oral Proficiency Testing
- Four stages: warm-up, level check, probe, and wind-down.
- Warm-up: The interviewer directs mutual introductions, helps the test-taker become comfortable with the situation, apprises the test-taker of the format, and reduces anxieties.
- Level check: Through preplanned Qs, the test-takers respond using expected forms and functions. Linguistic target criteria are scored.
- Probe: In this phase, test-takers go to the heights of their ability and extend beyond the limits of the interviewer’s expectation. 
- Through probe questions, the interviewer discovers the test-taker’s proficiency. At the lower levels of proficiency, probe items may demand a higher range of vocabulary and grammar than predicted. At the higher levels, probe items will ask the test-taker to give an opinion, to recount a narrative, or to respond to questions.
- Wind-down: The interviewer encourages the test-taker to relax with some easy questions, sets the test-taker’s mind at ease, and provides information about when and where to obtain the results of the interview. This part is not scored.
- Content specifications (p. 169)
- Sample questions (p. 169-170)
Sample Questions for an Oral Interview
- 1. Warm-up:
- How are you? /What’s your name? /What country are you from? /Let me tell you about this interview.
- 2. Level check:
- How long have you been in this city? / Tell me about your family. / What is your major? / How long have you been working at your degree? / What are your hobbies?
- What is your favorite food? / Tell me about an exciting experience you’ve had.
- 3. Probe:
- What are your goals for learning English in this program? / Describe your academic field to me. What do you like or dislike about it? / Describe someone you greatly respect, and tell me why you respect that person. / If you were [president, prime minister] of your country, what would you like to change about your country?
- 4. Wind-down:
- Did you feel okay about this interview? / You’ll get your results from this interview next week. / Do you have any questions to ask? / It was interesting to talk with you. Best wishes.
The Success of an Oral Interview
- Clear administrative procedures (practicality)
- Focusing the questions and probes on the purpose of the assessment (validity)
- Biased for best performance 
- Creating a consistent, workable scoring system (reliability)
- Descriptions of the Oral Proficiency Scoring Categories (p. 172-173)
Role Play
- It is a popular pedagogical activity in communicative language-teaching classes.
- The test administrator must determine the assessment objectives of the role play, then devise a scoring technique that pinpoints those objectives.
- Ex: “Pretend that you’re a tourist asking me for directions”, “You are buying a necklace from me in a flea market, and want a lower price”.
Discussion and Conversations
- As informal techniques to assess learners, discussions and conversations offer a level of authenticity and spontaneity that other assessment techniques may not provide.
- Abilities that can be elicited and observed include clarifying, questioning, paraphrasing, intonation patterns, body language, eye contact, and other sociolinguistic factors.
- Games
- Oral Proficiency Interview (OPI) guidelines (p. 177)
Designing Assessment: Extensive Speaking 
- Extensive speaking tasks are frequently variations on monologues, usually with minimal verbal interaction.
- Oral Presentation:
- Ex: presenting a report, a paper, a marketing plan, a sales idea, a design of a new product, or a method.
- Rules for effective assessment: (a) specify the criterion, (b) set appropriate tasks, (c) elicit optimal output, and (d) establish practical, reliable scoring procedures.
- Oral presentation checklist:
Rating scale: 3 = Excellent; 2 = Good; 1 = Fair; 0 = Poor
- Content: 
- The purpose or objective of the presentation was accomplished.
- The introduction was lively and got my attention. 
- The main idea or point was clearly stated toward the beginning.
- The supporting points were clearly expressed and supported well by facts and argument.
- The conclusion restated the main idea or purpose.
- Delivery 
- The speaker used gestures and body language well.
- The speaker maintained eye contact with the audience.
- The speaker’s language was natural and fluent.
- The volume of speech was appropriate.
- The rate of speech was appropriate.
- The pronunciation was clear and comprehensible.
- The grammar was correct and didn’t prevent understanding.
- Used visual aids, handouts, etc., effectively.
- Showed enthusiasm and interest.
- Responded to audience questions well.
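One way to turn the checklist ratings into a single score is sketched below. This is only an illustrative tally, not a procedure given in the source: the function name `score_checklist`, the shortened item labels, and the idea of reporting a percentage are all assumptions. It simply adds up the 0 to 3 ratings and compares the sum with the highest possible total.

```python
# Hypothetical helper: tallies checklist ratings on the 0-3 scale above.
RATING_LABELS = {3: "Excellent", 2: "Good", 1: "Fair", 0: "Poor"}


def score_checklist(ratings):
    """Return (total, maximum, percentage) for a dict of item -> rating."""
    for item, rating in ratings.items():
        if rating not in RATING_LABELS:
            raise ValueError(f"Rating for {item!r} must be 0, 1, 2, or 3")
    total = sum(ratings.values())
    maximum = 3 * len(ratings)  # every item can earn at most 3 points
    percentage = 100 * total / maximum if maximum else 0.0
    return total, maximum, percentage


# Example ratings for four checklist items (labels shortened for illustration).
sample = {
    "purpose accomplished": 3,
    "main idea stated early": 2,
    "eye contact": 2,
    "pronunciation": 1,
}
total, maximum, percentage = score_checklist(sample)
print(f"{total}/{maximum} ({percentage:.0f}%)")
```

A rater could also keep the Content and Delivery items in separate dictionaries and report two subtotals, which shows whether a weak score comes from the content or from the delivery.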
Picture-Cued Story-Telling
- At this level, a picture/a series of pictures is used as a stimulus for a longer story or description.
- The objective of eliciting narrative discourse needs to be clear (p. 181), e.g., “Tell the story and use the past tense.”
- For example, are you testing for oral vocabulary (girl, telephone, wet), for time relatives (before, after, when), for sentence connectors (then, so), for past tense of irregular verbs (woke, drank, rang), or for fluency in general?
- Criteria for scoring need to be clear.
Retelling a Story, News Event
- Test-takers hear / read a story or news event that they are asked to retell.
- It differs from the paraphrasing task discussed above in that it is a longer stretch of discourse and a different genre.
Translation (of Extended prose)
- Longer texts are presented for the test-taker to read in the native language and then translate into English.
- Texts vary in form: dialogue, directions, play, movie, etc.
- Advantages: the control of the content, vocabulary, the grammatical and discourse features.
- Disadvantages: a highly specialized skill is needed.
REFERENCE:
Brown, H. D. 2004. Language Assessment: Principles and Classroom Practices. Pearson.
- Assessing Listening (Chapter 6, p. 116-139)
- Assessing Speaking (Chapter 7, p. 140-184)
https://www.slideshare.net/mobile/irakhwati/assessing-listening-42086143. Accessed on 7 May 2020
https://www.slideshare.net/mobile/kheansokheng52/chapter-7-assessing-speaking. Accessed on 10 May 2020