Monday, September 9, 2019
Final Exam Assignment Example | Topics and Well Written Essays - 1250 words - 4
P3 initiates RNAIII, which codes for δ-hemolysin. The P2 operon is 3 kb in length. It is unique when compared to P3 since it has four open reading frames: agrA, agrB, agrC, and agrD. The agrA controls the sensory transduction while agrB is responsible for the production of histidine phosphokinase in bacteria. When mutations cause defects in any one of the four open reading frames, the resulting strains don't have P2 and P3 transcripts and become agr-. These strains cannot initiate transcription from P2 and P3. Insertion at the ClaI site in the RNAIII region in P3 results in inactivation of agr functions. 2. a) β-lactamase has higher activity as compared to the normal P2 and P3 promoters, and thus β-lactamase can make other existing non-useful plasmids like p1524 unstable. Using P2 and P3 instead of β-lactamase allows expression of the p15424, and this interferes with the results. b) RN6390B, an agr+ strain, produces stronger signals for alpha and beta hemolysins than does RN6911, an agr-null strain, whose signal is extremely weak. If the researcher used agr+ instead, he would not obtain the same results due to interference by other unnecessary signals. c) When a B lymphocyte secretes antibodies that are directed towards a specific epitope on an antigen, we call the antibodies monoclonal. However, when a significant number of antibodies are secreted that have different affinities and specificities towards different epitopes, we call them polyclonal antibodies. The β-hemolysin antibody is a polyclonal antibody with a wide range of affinities and specificities, and thus it can detect any antigen present in its environment. d) In the anti-β blot, protein A produces a stronger signal for arg-a42 and no signal in arg-a40. The signal for β-hemolysin is present only in arg-a40 and not arg-a42. These results mean that different regions of P3 code for different products. 3. The author used RN7220 because it can increase the hla
Sunday, September 8, 2019
Dentist Office Proposal Research Example | Topics and Well Written Essays - 750 words
This happens because the government posts a facility's Medicare acceptance standing in medical assistance literature and on government websites. The state government further offers free advertisements that attract patients to the healthcare facility (Ketler 49). This is considerably helpful in the early days of a health care practice, when the facility's leaders need to strengthen their business in the community in order to repay business loans and medical school debts. There is an assured income source when a health care organization accepts Medicare. The state and federal governments jointly fund Medicaid social programs to provide services on a continual basis. The government assures payment if a medical procedure that the organization's practice prescribes meets the eligibility rules (Sisks 52). The health care organization does not have to hunt the patient down in order to secure its income, or adjust treatment fees to make sure that the patient can afford medical care. This offers security in the projection of anticipated revenues and enables the medical providers to meet their monetary obligations. Joining Medicaid would ensure a positive economic impact on the business environment and the entire state economy. Through this, there would be more job opportunities and more state and income tax revenue within the entire healthcare sector and beyond, owing to the induced multiplier effect of expenditure (Sisks 54). Medicaid has an immensely competitive health insurance market in states that have accepted the social program. Joining such a program would immensely benefit the health care organization by placing it in a competitive community where people value, afford and procure health care, thus promoting good health and affluence in the community, the state and the entire nation (Ketler 36).

Demerits

While the federal government's departments and agencies assure payment for eligible Medicaid treatments and procedures, such entities also take control over the recommended fees for such services. This means that medical practitioners do not have the mandate to determine their charges for clinical procedures on Medicaid patients (Russell 82). This makes the health care provider a "middleman" between the government department remitting payments and the patient. The government may control and restrict standard charges, regardless of whether this seems inappropriate to the medical practitioner. The health care center plans to serve its community members, promote good health nationwide, generate income and serve every patient regardless of cost or complexity of reported diseases. However, the government is the chief dictator of the medical services that health care practitioners ought to provide under Medicaid. This may push a practitioner to conform to the government's prescribed course of medical care rather than treating the patient in the best way. A low-income Medicaid patient may be unable to afford the cost of a definitive cure if the government has not included it in the list of medical care available under the Medicaid social program (Sisks 51). The health care provider capacity is insufficient and may worsen in future. The contemporary provider capacity, especially the capacity of emergency departments, safety net providers and primary care
Saturday, September 7, 2019
Ethical issues relating to life Essay Example | Topics and Well Written Essays - 750 words
83 per cent of these abortions are conducted in the underdeveloped countries while the developed countries account for 17 per cent of these abortions. In such circumstances, the study of ethics revives the need to behave ethically. A sound awareness of the principles of ethics is fundamental to the development of morally responsible people who would choose not to abort their children. Sterilization is the term used for killing. Generally, sterilization is used for killing microbes in eatables so that they can be made more hygienic. Killing the fetus is also sterilization. Two drugs, namely Methotrexate and Misoprostol, which were previously used for the treatment of cancer and ulcers respectively, are now increasingly being used for abortion. Methotrexate poisons the fetus. This is followed by the action of Misoprostol, which empties the uterus. Methotrexate is a very toxic drug which can kill the mother along with the baby. Hence, this is a very unethical act. Ethics is the study of principles, norms and values that are standardized and mutually accepted by scholars as conducive to the overall betterment of society. Ethics inculcates in people a sense for making well-informed decisions in critical situations. Ethics tells how things should be handled in a given setting so that individual and collective losses can be minimized and the benefit of the work can be enhanced both for the individual and for the nation as a whole. Ethics compels an individual to respect others' rights while accomplishing his/her individual goals. Ethics disallows the use of such toxic drugs for conducting abortion. Contraception is the name for controlling pregnancy. An ethics committee plays a very important role in contraception in that it devises methods to control pregnancy without any loss to the mother. "In the broadest general terms an ethics committee, satisfies the condition of the Federal Sentencing
Biography of Plato Essay Example for Free
Plato was a Greek philosopher, mathematician, rhetorician, writer, founder of the Academy, and even a double Olympic champion. He was born in 427 BCE into the family of wealthy and influential Athenian parents, Ariston and Perictione. Plato's real name was Aristocles. Because of his athletic figure, his wrestling coach called him Plato, which means "broad". As Plato was from a wealthy family, he had the best teachers of that time, who taught him music, grammar and athletics. At the age of 20, Plato met Socrates, who became his teacher, mentor and closest companion. Eight years with Socrates influenced Plato's life path. After Socrates died, Plato traveled to Egypt, Italy, Sicily and Cyrene. Then he came back and opened his famous philosophical Academy. The Academy was an institution of higher education. Such philosophers as Aristotle, Heraclitus, Crates and Xenocrates attended Plato's Academy. Plato's writings are dialogues and letters to his teacher Socrates, which cover a variety of topics, ranging from philosophy to ethics and from mathematics to rhetoric. In these dialogues Plato used Socrates as a fictional character. His early dialogues are typically devoted to the investigation of a single issue, where results are rarely achieved. The middle dialogues developed, expressed, and defended Plato's conclusions about central philosophical issues. His later writings often modify or abandon the structure of a dialogue; they were critical examinations of the theory of forms, discussions of the problem of knowledge and cosmological speculations. Plato's most famous works are The Apology of Socrates, The Symposium and The Republic. Plato started the very first university in Europe, the Academy, in 387 BC in Athens. Though the Academy was not open to the public, it did not charge fees for education. There were therefore no formal teachers or students, but there was an unspoken distinction between teachers and students.

One of Plato's most famous students, who attended the Academy for more than 20 years, was Aristotle. There is evidence of lectures given in the Academy, such as Plato's lecture On the Good, and of the use of dialectic. The Academy continued for nearly 1000 years until it was closed by the emperor Justinian, because it was believed not to follow the Christian religion. Plato died on his birthday in 347 BC. It is unknown how he died; there are multiple versions, ranging from suicide to the account given in The American Scholar, according to which Plato died in his bed while a young girl played the flute to him. Plato moved his finger to indicate the beat and rhythm so that she could find the right measure. When the girl found the right measure, Plato died listening to it.

References:
Schall, James V. "On the Death of Plato." The American Scholar 65 (1996): 401-15. Print.
Kemerling, Garth. "Plato." Philosophy Pages. Web. 9 Aug. 2006.
Kraut, Richard. "Plato." The Stanford Encyclopedia of Philosophy (Fall 2011 Edition), Edward N. Zalta (ed.).
Friday, September 6, 2019
Francis Scott Key Fitzgerald Essay Example for Free
Many characters in The Great Gatsby parallel Fitzgerald's life. For example, Daisy, the woman on whom Jay Gatsby has based his whole life, is similar to Zelda Sayre, who would not marry Fitzgerald at first because of his lack of success. Gatsby and Fitzgerald both met the women vital to their lives at dances, and both while they were stationed at army camps. Gatsby met Daisy at Camp Taylor in Kentucky, where they danced and fell in love. However, after Gatsby went off to war, they never got back together again. Fitzgerald met his wife, Zelda, at Camp Sheridan in Alabama. Instead of going off to war (his regiment was ready to go to Europe, but the Armistice came before they could leave the States), he went to New York to earn enough money to marry Zelda. In the movie version, Daisy tells Gatsby that "rich girls don't marry poor boys". This line was taken straight out of Fitzgerald's life. The father of his first love, a young woman by the name of Ginevra King, supposedly told him that after Fitzgerald asked for Ginevra's hand in marriage. The Great Gatsby is F. Scott Fitzgerald's most renowned book, and still one of the most read novels in American literature. A book with this much success was obviously a product of great influence. The Great Gatsby draws many extensive parallels between F. Scott Fitzgerald's life and the novel. These similarities range from basing characters on important people from his personal life, to interweaving the intricate love relationships he went through into the novel, to recreating the American Dream. The book comes as a direct result of many of the events in Fitzgerald's early life. First are the most noticeable parallels, the characters he chooses. Fitzgerald parallels himself in two of the main characters in The Great Gatsby, Jay Gatsby and Nick Carraway. Nick represents Fitzgerald's passive, or indecisive, and observant characteristics.
On the other hand, Gatsby shows Fitzgerald's passionate and active attributes.
Thursday, September 5, 2019
Tests of Significance: Uses and Limitations
Abstract

Statistical tools are undoubtedly important in decision making. Their use in everyday problems has led to a number of discoveries and conclusions and has enhanced knowledge. Applications range from direct calculation with general statistical formulas to formulas built into statistical software to speed up decision making. The statistical tools for testing hypotheses, significance tests, are powerful, but only if they are used correctly and with a good understanding of their concepts and limitations. Some researchers have used these tests wrongly, leading to wrong conclusions. This paper looks at the different significance tests (both parametric and non-parametric), their uses, when they should be used, and their limitations. It also evaluates the use of statistical significance tests in Information Retrieval and then examines the significance tests used by researchers in papers submitted to the Special Interest Group on Information Retrieval (SIGIR) in 2006, 2007 and 2008. For the combined period 2006-2008, including the years 2006 and 2008, of the papers submitted had statistical tests used and of these tests were used wrongly.

Key Words: Significance Test, Information Retrieval, Parametric Tests, Non-parametric Tests, Hypothesis Testing

Chapter One

1.0 Introduction

Statistical methods play a very important role in all aspects of research, ranging from data collection and recording through analysis to drawing conclusions and inferences. The credibility of the research results and conclusions depends on each and every one of these steps; a fault in any of them can render worthless a study that took several years and cost millions of shillings. Simply running a test and crunching figures does not show that statistics has been used properly in a given study; the researcher should be able to justify why he or she used that specific test or method.
Misuse of significance tests is not new in the world of science. According to Campbell (1974), there are different types of statistical misuse:

1.0.1 Discarding unfavorable portions of data

This occurs when the researcher selects only the portion of the data that produces exactly the results he/she requires while discarding the rest. After a well-conducted study, the researcher might get values that are not consistent with what he/she was expecting, and might decide to ignore this section of the data during the analysis so as to get the "expected results". This is a mistake, since the inconsistent data could open up new lines of thought in that field: if the irregularities are checked and the reasons for them explained, more ideas about the area can be explored.

1.0.2 Overgeneralization

Sometimes the conclusions from a study hold only for that particular research problem, but the researcher blindly generalizes the results to other kinds of research, similar or dissimilar. Overgeneralization is a common mistake in current research. After successfully completing a study in a particular field, a researcher might be tempted to extend its conclusions to other fields of study without regard for the different orientations of the populations involved and the assumptions made about them.

1.0.3 Non-representative sample

This arises when the researcher selects a sample that produces results geared towards his/her liking. The sample selected for a particular study should be one that truly represents the entire population, and the procedure for selecting the sample units should be unbiased.

1.0.4 Consciously manipulating data

This occurs when a researcher consciously changes the collected data in order to reach a particular conclusion. It is mainly seen when the researcher knows exactly what the client's aims are and changes part of the data so that the aim of the research is strongly supported. For example, if a researcher carrying out a regression analysis draws a scatter plot and sees that there are many outliers, he/she might decide to change some values so that the points fall on, or very close to, a straight line. This leads to results that appeal to the client and to other users but do not give a true indication of what is really happening in the population at large.

1.0.5 False correlation

This is observed when the researcher claims that one factor causes another when in fact both factors are caused by a hidden factor that was not identified during the study. Correlation studies are common in the social sciences and are sometimes not adequately designed, which leads to unsatisfactory results. In a correlation study that checks whether variable X causes variable Y, there are really four possibilities. The first is that X causes Y; the second is that Y causes X; the third is that X and Y are both caused by another, unidentified variable, say Z; and the last is that the correlation between X and Y occurred purely by chance. All these possibilities should be checked in this kind of study to avoid rushing to wrong conclusions. False causality can be reduced by using two groups for the same experiment, that is, the "control group" (the one receiving a placebo) and the "treatment group" (the one receiving the treatment). Even though this method is effective, implementing it raises many challenges. There are ethical issues, for instance when the patients in one group are given a placebo (a drug with no effect) without their knowledge while the other group is given the real drug. One question comes to mind: is it ethical to do this to the first group? Carrying out the experiment in parallel for two different groups can also prove to be very expensive.

1.0.6 Overloaded questions

The questions used in a survey can strongly affect its outcome. The structure of the questions in a questionnaire and the way they are formulated and asked can influence how the respondent answers. Long, wordy questions can bore a respondent, who might fill in the questionnaire in a hurry just to finish it, without really caring about the answers provided. The framing of questions can also produce leading questions, which steer the respondent towards an answer, for example: "The government is not offering security to its citizens, do you agree to this? (Yes or No)".

Statistical significance has been in use for more than 300 years (Huberty, 1993). Despite this long history, this field of decision making has been criticized from all directions, and many researchers have written about the problems of statistical significance testing. Harlow et al. (1997) discussed the controversy over significance testing in depth. Carver (1993) expressed dislike of significance tests and clearly urged researchers to stop using them. In his book How to Lie with Statistics, Huff (1954) described in depth the errors, both intentional and unintentional, and the misinterpretations made in statistical analyses. Some publishers, e.g. the American Psychological Association (APA), have recommended minimal use of statistical significance tests by researchers submitting papers for publication (APA, 1996), though without prohibiting the tests. Despite the relentless criticism, other researchers have not given up on statistical significance testing but have clearly encouraged users of the tests to understand them well before drawing conclusions from them.
Mohr (1990) discussed these tests and supported their use, while warning researchers to know the limitations of each test and to apply the tests correctly so as to draw correct inferences and conclusions. In his paper, Burr (1960) supported the use of statistical significance tests but asked researchers to make allowances for the existence of statistical errors in the data. Amidst these controversies, statistical significance testing has been applied in many areas of research and remarkable achievements have been recorded. One such area is information retrieval (IR), where significance tests have been used to compare different algorithms.

1.1.0 Information retrieval

Information retrieval is defined as the science of searching databases, the World Wide Web and other document collections for information on a particular subject. To get information, the user enters keywords to be used for the search; a set of objects containing the keywords is usually returned, from which the user can pick the one that provides the required information. The user usually refines the search progressively by narrowing it down and using more specific words. Information retrieval has developed as a highly dynamic and empirical discipline, requiring careful and thorough evaluation to demonstrate the superior performance of new techniques on representative document collections. There are many algorithms for information retrieval. It is usually important to measure the performance of different information retrieval systems so as to know which one gives the required information faster. To measure information retrieval effectiveness, three test items are required:

(i) A collection of documents on which the different retrieval methods will be run and compared.
(ii) A test collection of information needs, expressible as queries.
(iii) A collection of "relevance judgments" that distinguish whether the results returned are relevant or irrelevant to the person doing the search.

A question might arise as to which collection of objects should be used in testing different systems. Several standard test collections are used universally. These include:

(i) Text Retrieval Conference (TREC). This is a standard collection comprising 6 CDs containing 1.89 million documents (mainly, but not exclusively, newswire articles) and relevance judgments for 450 information needs, which are called topics and are specified in detailed text passages. Individual test collections are defined over different subsets of this data.
(ii) GOV2. This was developed by the U.S. National Institute of Standards and Technology (NIST). It is a 25 million page collection of web pages.
(iii) NII Test Collections for IR Systems (NTCIR). This is also a large test collection, focusing mainly on East Asian languages and cross-language information retrieval, where queries are made in one language over a document collection containing documents in one or more other languages.
(iv) Cross Language Evaluation Forum (CLEF). This test collection is mainly focused on European languages and cross-language information retrieval.
(v) 20 Newsgroups. This text collection was collected by Ken Lang. It consists of 1000 articles from each of 20 Usenet newsgroups (the newsgroup name being regarded as the category). After the removal of duplicate articles, as it is usually used, it contains 18941 articles.
(vi) The Cranfield collection. This is the oldest test collection allowing precise quantitative measures of information retrieval effectiveness, but it is nowadays too small for anything but the most elementary pilot experiments.
It was collected in the United Kingdom starting in the late 1950s and contains 1398 abstracts of aerodynamics journal articles, a set of 225 queries, and exhaustive relevance judgments of all (query, document) pairs.

There are several methods of measuring the performance of retrieval systems, namely precision, recall, fall-out, the E-measure and the F-measure, to mention just a few, since researchers keep proposing new ones. A brief description of each method will shed some light.

1.1.1 Recall

Recall in information retrieval is defined as the number of relevant documents returned from a search divided by the total number of relevant documents in the database. Recall can also be seen as evaluating how well the retrieval method finds the required information. Let $A$ be the set of all retrieved objects and $B$ be the set of all relevant objects; then

$$\text{Recall} = \frac{|A \cap B|}{|B|} \tag{1.1}$$

As an example, suppose a database contains 500 documents, of which 100 contain relevant information required by a researcher; the complement, the number of documents not required, is 400. If the researcher uses a system to search this database and it returns 100 documents, all of them relevant, then the recall is

$$\text{Recall} = \frac{100}{100} = 1$$

Suppose instead that out of 120 returned documents 30 are irrelevant, so that 90 relevant documents are returned; then the recall is

$$\text{Recall} = \frac{90}{100} = 0.9$$

1.1.2 Precision

Precision is defined as the number of relevant documents retrieved by the system divided by the total number of documents retrieved in that search. It evaluates how well the retrieval method filters out unwanted information. With $A$ the set of all retrieved objects and $B$ the set of all relevant objects,

$$\text{Precision} = \frac{|A \cap B|}{|A|} \tag{1.2}$$

As an example, suppose a database contains 500 documents, of which 100 contain relevant information required by a researcher; the complement, the number of documents not required, is 400. If the researcher uses a system to search this database and it returns 100 documents, all of them relevant, then the precision is

$$\text{Precision} = \frac{100}{100} = 1$$

Suppose instead that out of 120 returned documents 30 are irrelevant; then the precision is

$$\text{Precision} = \frac{90}{120} = 0.75$$

Both precision and recall rest on one term: relevance. The Oxford dictionary defines relevance as "connected to the issue being discussed". Yolanda Jones (2004) identified three types of relevance:

Subject relevance: the connection between the subject submitted via a query and the subject covered by the returned texts.
Situational relevance: the connection between the situation being considered and the texts returned by the database system.
Motivational relevance: the connection between the motivations of a researcher and the texts returned by the database system.

There are two measures of relevance:

Novelty ratio: the proportion of the items returned from a search and acknowledged by the user as relevant of which the user was previously unaware.
Coverage ratio: the proportion of the relevant documents the user was already aware of before starting the search that the search returns.

Precision and recall affect each other: an increase in recall typically decreases precision. Increasing a system's ability to retrieve more documents increases recall, but it has a drawback, since the system will also retrieve more irrelevant documents and so reduce its precision. A trade-off between the two measures is therefore required to ensure better search results. Precision and recall rely on the following assumptions:

They assume that a system either returns a document or doesn't.
They assume that a document is either relevant or not relevant, with nothing in between.
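The worked example for equations (1.1) and (1.2) can be checked with a few lines of Python. This is only a minimal sketch: it assumes documents are identified by integer IDs, and the function name `precision_recall` and the ID ranges are hypothetical choices made for illustration, not part of the original study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Database of 500 documents (IDs 0..499); IDs 0..99 are the relevant ones.
relevant = set(range(100))

# Case 1: the system returns 100 documents, all of them relevant.
retrieved_all_relevant = set(range(100))

# Case 2: the system returns 120 documents, 30 of which are irrelevant.
retrieved_mixed = set(range(90)) | set(range(100, 130))

print(precision_recall(retrieved_all_relevant, relevant))  # (1.0, 1.0)
print(precision_recall(retrieved_mixed, relevant))         # (0.75, 0.9)
```

The second case illustrates the trade-off described above: returning more documents than there are relevant ones lowers precision even though most relevant documents are found.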
New methods are being introduced by researchers which rank the degree of relevance of the documents.

1.1.3 Receiver Operating Characteristics (ROC) Curve

This is the plot of the true positive rate, or sensitivity, against the false positive rate, or (1 − specificity). Sensitivity is just another term for recall. Writing $FP$ for the number of false positives (irrelevant documents retrieved) and $TN$ for the number of true negatives (irrelevant documents not retrieved), the false positive rate is given by $FP/(FP + TN)$. An ROC curve always goes from the bottom left to the top right of the graph. For a good system, the graph climbs steeply on the left side. For unranked result sets, specificity, given by $TN/(FP + TN)$, was not seen as a very useful idea. Because the set of true negatives is always so large, its value would be almost 1 for all information needs (and, correspondingly, the value of the false positive rate would be almost 0).

1.1.4 F-measure and E-measure

The F-measure is defined as the weighted harmonic mean of recall and precision. With $P$ the precision and $R$ the recall, it is defined numerically as

$$F_\beta = \frac{(\beta^2 + 1)\,P\,R}{\beta^2 P + R} \tag{1.3}$$

where $\beta$ is the weight. If $\beta$ is assumed to be 1, then

$$F_1 = \frac{2PR}{P + R} \tag{1.4}$$

The E-measure is given by

$$E = 1 - F_\beta \tag{1.5}$$

The E-measure ranges from 0 to 1.0; because it is the complement of the F-measure, lower values indicate better performance.

1.1.5 Fall-Out

This is defined as the proportion of irrelevant documents that are returned in a search out of all the possible irrelevant documents. If $A$ is the set of all retrieved objects, $B$ the set of all relevant objects and $\bar{B}$ its complement, the set of irrelevant objects, then

$$\text{Fall-out} = \frac{|A \cap \bar{B}|}{|\bar{B}|} \tag{1.6}$$

It can also be defined as the probability of a system retrieving an irrelevant document.

These are just a few methods of measuring the performance of search systems. Once a single system has been evaluated, the problem arises of comparing two systems or algorithms: is this system better than the other one? To answer this question, scientists in information retrieval use statistical significance tests to establish whether the differences in system performance could have arisen by chance. These tests are used to support the claim that one system is better than another.

Statement of the problem

Statistical inference tools like statistical significance tests are important in decision making. Their use has been on the rise in different areas of research. With this rise, novice users apply these tools in questionable ways. Many researchers do not understand the basic concepts of statistics, which leads to misuse of the tools. Any conclusion reached from a study might be termed bogus if the statistical tests used in it are shoddy. More light needs to be shed on this area of research to ensure correct use of these tests. Researchers in information retrieval also use these tests to compare systems and algorithms. Are the conclusions from these tests truly correct? Are there other ways of comparison which minimize the use of statistical tests?

Objectives of the study

The objectives of this study are to:

(i) Investigate the use and misuse of statistical significance tests in scientific papers submitted by researchers to SIGIR.
(ii) Shed light on the different statistical significance tests, their use, assumptions and limitations.
(iii) Identify the most important statistical concepts that can provide solutions to the problems of statistical significance in scientific papers submitted by researchers to SIGIR.
(iv) Investigate the reality of the problems of statistical significance in scientific papers submitted by researchers to SIGIR.
(v) Investigate the statistical significance tests used by researchers in information retrieval.
(vi) Discover the availability of statistical concepts and methods that can provide solutions to the problems of statistical significance in scientific papers submitted by researchers to SIGIR.

Chapter Two

This section of the paper is divided into three major parts. The first, on sample selection and sample size, discusses methods of selecting a sample and choosing the size of the sample to be used in a given study; the second deals with statistical analysis methods and procedures, mainly significance testing; and the third discusses other statistical methods that can be used in place of statistical significance tests.
2.0 Sample Selection and Sample Size

2.0.1 Sample selection

Sampling plays a major role in research. According to Cochran (1977), sampling is the process of selecting a portion of the population and using the information derived from this portion to make inferences about the entire population. Sampling has several advantages, namely:

(i) Reduced cost. Carrying out a census is far more expensive than collecting information from a small portion of the population. Because only a small number of measurements are made, only a few people need to be hired to do the job, whereas a complete census requires a large labor force.

(ii) Greater speed. Since only a few items are measured, the measurements take less time, and the data can be summarized more quickly than when measurements are taken for the whole population.

(iii) Greater accuracy. Since only a few units are considered, the researchers can be very thorough. Measuring the entire population may leave the researchers tired partway through the process, leading to careless data collection and shoddy analysis.

The choice of the sampling units in a given piece of research may affect the credibility of the whole study. The researcher must make sure that the sample being used is not biased, that is, that it represents the whole population. There are several methods of selecting samples to be used in a study. A researcher should always make sure that the sample drawn is large enough to be representative of the population as a whole and at the same time manageable. In this section the two major types of sampling, random and non-random, will be examined.

2.0.1.1 Random sampling

In random sampling, all the items or individuals in the population have an equal chance of being selected into the sample.
This procedure ensures that no bias is introduced during the selection of sample units, since each item is selected only by chance and the selection does not depend on the person assigned the duty of drawing the sample. There are five major random sampling techniques: simple random sampling, stratified sampling, systematic sampling, cluster sampling and multi-stage sampling. The following sections discuss each of these.

2.0.1.1.1 Simple random sampling

In simple random sampling, each item in the population has an equal chance of being included in the sample. Usually each sampling unit is assigned a unique number; numbers are then produced by a random number generator, and a sampling unit is included in the sample if its number is generated. One advantage of simple random sampling is its simplicity and ease of application when dealing with small populations. However, every entity in the population has to be listed and given a unique number before the random numbers are read off. This makes the method tedious and cumbersome, especially where large populations are involved.

2.0.1.1.2 Stratified sampling

In stratified random sampling, the entire population is first divided into a fixed number of disjoint subpopulations. Each sampling unit belongs to one and only one subpopulation. These subpopulations are called strata. They may be of different sizes; the units within a stratum are homogeneous, and each stratum differs from the other strata. It is from these strata that samples are drawn for a particular study. Commonly used strata include states, provinces, age, sex, religion, academic ability and marital status. Stratification is most useful when the stratifying variables are simple to work with, easy to observe and closely related to the topic of the survey (Sheskin, 1997). Stratification can be used to select more of one group than another.
This may be done if it is felt that the responses vary more in one group than in another. If the researcher knows that every entity in a group has much the same value, only a small sample is needed to get information for that group, whereas in another group the values may differ widely and a bigger sample is needed. To combine group-level information into an answer for the whole population, you have to take account of the proportion selected from each group. This method is mainly used when information is required for particular subdivisions of the population, when administrative convenience is an issue, or when the sampling problems differ greatly in different portions of the population under study.

2.0.1.1.3 Systematic sampling

Systematic sampling is quite different from the other methods of sampling. Suppose the population contains N units and a sample of n units is required. A sampling interval, call it k, is fixed, a first unit (represented as a number) is drawn at random, and the researcher then picks every kth unit thereafter. For example, if k is 20 and the first unit drawn is 5, the subsequent units will be 25, 45, 65, 85 and so on. The implication is that the whole sample is determined by the first item alone, since the rest are obtained sequentially. This type is called an every kth systematic sample.

This technique can also be used when questioning people in a sample survey. A researcher might select every 15th person who enters a particular store, after selecting a person at random as a starting point, or interview the shopkeepers of every 3rd shop in a street, after selecting a starting shop at random. A researcher may also want to select a sample of fixed size. In this case, it is first necessary to know the size of the whole population from which the sample is being selected.
The appropriate sampling interval, I, is then calculated by dividing the population size, N, by the required sample size, n. This method is advantageous since it is easy, and it is often more precise than simple random sampling. It is also simpler to select one random number and then every kth member on the list than to select as many random numbers as the sample size. It also gives a good spread right across the population. A disadvantage is that the researcher may need a list to start with in order to know the sample size and calculate the sampling interval.

2.0.1.1.4 Cluster sampling

According to the Australian Bureau of Statistics, cluster sampling divides the population into groups, or clusters. A number of clusters are selected randomly to represent the population, and then all units within the selected clusters are included in the sample. No units from non-selected clusters are included in the sample; they are represented by those from the selected clusters. This differs from stratified sampling, where some units are selected from each group. Each cluster is heterogeneous within itself (that is, the sampling units inside a cluster differ from one another), while the clusters resemble one another. Cluster sampling has several advantages, including reduced costs, simplified field work and more convenient administration. Instead of having a sample scattered over the entire coverage region, the sample is concentrated in relatively few collection points (clusters). Cluster sampling generally provides results that are less accurate than those of stratified random sampling.

2.0.1.1.5 Multi-stage sampling

Multi-stage sampling is like cluster sampling, but involves selecting a sample within each chosen cluster, rather than including all units in the cluster. According to the Australian Bureau of Statistics, multi-stage sampling involves selecting a sample in at least two stages.
In the first stage, large groups or clusters are selected. These clusters are designed to contain more population units than are required for the final sample. In the second stage, population units are chosen from the selected clusters to derive a final sample. If more than two stages are used, the process of choosing population units within clusters continues until the final sample is achieved. If two stages are used, it is called two-stage sampling; if three stages are used, it is called three-stage sampling, and so on.

2.0.2 Determination of sample size to be used

2.1 Statistical Analysis

In this section, different statistical tests are discussed in detail in their general form, followed by a discussion of how those used in IR are applied to information retrieval. Only some of these tests are used to compare systems and/or algorithms. In this paper we look at three areas of statistical analysis, namely:
(i) Summarizing data using a single value.
(ii) Summarizing variability.
(iii) Summarizing data using an interval (no specific value).
In the first case we have the mean, mode, median and so on; in the second we look at variability in the data; and in the third we look at confidence intervals and at parametric and nonparametric hypothesis tests.

2.1.1 Summarizing data using a single value

In this case, the data being analyzed are represented by a single value. Examples are discussed below.

2.1.1.1 Mean

There are three different kinds of mean: (i) arithmetic mean, (ii) geometric mean and (iii) harmonic mean.

(i) Arithmetic mean

This is computed by summing all the observations and then dividing by the number of observations collected. Let $x_1, x_2, \ldots, x_n$ be n observations of a random variable X. The arithmetic mean is defined as

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

When to use the arithmetic mean

The arithmetic mean is used when: The collected data are numeric.
The data have only one mode (unimodal). The data are not skewed, i.e. not concentrated at extreme values. The data do not have many outliers (very extreme values).

The arithmetic mean is not used when: The data are categorical. The data are extremely skewed.

(ii) Geometric mean

This is defined as the nth root of the product of the observations, that is, the product raised to the power 1/n. Let $x_1, x_2, \ldots, x_n$ be n observations of a random variable X. The geometric mean is defined as

$$G = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$$

The geometric mean is used when: The observations are numeric. The quantity of interest is the product of the observations.

(iii) Harmonic mean

This is defined as the number of observations divided by the sum of the reciprocals of the observations. Let $x_1, x_2, \ldots, x_n$ be n observations of a random variable X. The harmonic mean is defined as

$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

The harmonic mean is used when: An average can be justified for the reciprocals of the observations.

2.1.1.2 Median

This is defined as the middle value of the observations. The observations are first arranged in ascending or descending order, then the middle value is taken as the median.

The median is used when: The observations are skewed. The observations have a single mode. The observations are numerical.

The median is not used when: We are interested in the total value.

2.1.1.3 Mode

This is defined as the value that has the highest frequency of occurrence in the given dataset.

The mode is used when: The dataset is categorical. The dataset is both numeric and multimodal.

2.1.2 Summarizing variability

Variability in data can be summarized using the following measures:

2.1.2.1 Sample variance

Let $x_1, x_2, \ldots, x_n$ be n observations of a random variable X. The sample variance, $s^2$, is given by

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

The standard deviation, $s$, is the square root of the sample variance.

The standard deviation is used when: The data are normally distributed.

2.1.2.2 The C
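As a rough sketch of the summary measures in Sections 2.1.1 and 2.1.2.1, the Python snippet below computes each of them for a small made-up data set. The function name `summarize`, the dictionary keys and the sample values are illustrative assumptions, not part of the study; the variance is assumed to use the usual n − 1 divisor.

```python
import math
import statistics


def summarize(observations):
    """Compute the summary measures of Sections 2.1.1 and 2.1.2.1.

    Hypothetical helper for illustration only. The geometric and
    harmonic means assume strictly positive observations, and the
    sample variance needs at least two observations.
    """
    n = len(observations)
    arithmetic = sum(observations) / n
    # nth root of the product of the observations
    geometric = math.prod(observations) ** (1 / n)
    # number of observations divided by the sum of reciprocals
    harmonic = n / sum(1 / x for x in observations)
    # sample variance with the n - 1 divisor
    variance = sum((x - arithmetic) ** 2 for x in observations) / (n - 1)
    return {
        "arithmetic_mean": arithmetic,
        "geometric_mean": geometric,
        "harmonic_mean": harmonic,
        "median": statistics.median(observations),
        # multimode returns every value tied for the highest frequency
        "modes": statistics.multimode(observations),
        "sample_variance": variance,
        "standard_deviation": math.sqrt(variance),
    }


# Made-up observations, chosen only to keep the arithmetic easy to follow.
data = [2, 4, 4, 8]
summary = summarize(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For these values the arithmetic mean (4.5) exceeds the geometric mean (4.0), which in turn exceeds the harmonic mean (about 3.56), the usual ordering of the three means for positive data that are not all equal.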
Wednesday, September 4, 2019
Spain :: essays research papers
Population

The Spanish people are essentially a mixture of the indigenous peoples of the Iberian Peninsula with the successive peoples who conquered the peninsula and occupied it for extended periods. These added ethnologic elements include the Romans, a Mediterranean people, and the Suevi, Vandals, and Visigoths, Teutonic peoples. Semitic elements are also present. Several ethnic groups in Spain have kept a separate identity, culturally and linguistically. These include the Basques (Euskal-dun), who number about 2.5 million and live chiefly around the Bay of Biscay; the Galicians, numbering about 2.5 million, who live in northwestern Spain; and the nomadic Spanish Gypsies (Gitanos).

Population Characteristics

The population of Spain (1991) was 38,872,268. The estimate for 1993 was 39,207,159; the overall density was about 78 people per sq km (about 201 per sq mi). Spain is increasingly urban, with more than three-fourths of the population in towns and cities.

"Spain," Microsoft (R) Encarta. Copyright (c) 1994 Microsoft Corporation. Copyright (c) 1994 Funk & Wagnall's Corporation.

Forestry and Fishing

The cork-oak tree is the principal forest resource of Spain, and the annual production of cork, more than 110,000 metric tons in the mid-1980s, is second only to that of Portugal. The yield of Spain's forests is insufficient for the country's wood-pulp and timber needs. The fishing industry is important to the Spanish economy. The annual catch was about 1.5 million metric tons in 1990 and consisted primarily of tuna, squid, octopus, hake, sardines, anchovies, mackerel, blue whiting, and mussels.

Mining

The mineral wealth of Spain is considerable. In 1990 annual production included about 36 million metric tons of coal and lignite, 1.5 million tons of iron ore, 255,000 tons of zinc concentrates, 58,400 tons of lead, 5 million tons of gypsum, and 795,000 tons of crude petroleum.
The principal coal mines are in the northwest, near Oviedo; the chief iron-ore deposits are in the same area, around Santander and Bilbao; large mercury reserves are located in Almadén, in southwestern Spain, and copper and lead are mined in Andalusia. Other minerals produced are potash, manganese, fluorite, tin, tungsten, wolfram, bismuth, antimony, cobalt, and rock salt.

Manufacturing

Among the leading goods manufactured in Spain are textiles, iron and steel, motor vehicles, chemicals, clothing, footwear, ships, refined petroleum, and cement. Spain is one of the world's leading wine producers, and the annual output in the late 1980s was about 2.3 million metric tons. The iron and steel industry, centered in Bilbao, Santander, Oviedo, and Avilés, produced about 13.