Glossary

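Aggregation bias (聚集偏差): Aggregation bias occurs when incorrect inferences are drawn from data that have been aggregated, or summarized. This includes making inferences about the characteristics of the parts of a whole based on the characteristics of the whole itself.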

Aggregation (聚集): Aggregation refers to the process by which data are collected and then summarized in some way or sorted into categories.
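As a minimal sketch of the definition above, the following Python snippet sorts a handful of records into categories and summarizes each category with a total. The records and the function name `aggregate_by_category` are hypothetical and serve only as illustration.

```python
from collections import defaultdict

# Hypothetical raw records: (category, amount) pairs.
records = [("fruit", 3), ("vegetable", 5), ("fruit", 4), ("grain", 2)]


def aggregate_by_category(rows):
    """Sum the amounts within each category."""
    totals = defaultdict(int)
    for category, amount in rows:
        totals[category] += amount
    return dict(totals)


summary = aggregate_by_category(records)
print(summary)
```

The individual records are no longer visible in `summary`, which is why inferences about single records drawn from aggregated data can be misleading.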

Applied statistics (應用統計): Applied statistics is a type of statistics that applies statistical methods to disciplines and areas of study.

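Arithmetic mean (算術平均值): The arithmetic mean, often simply called the mean, is a type of average, or measure of central tendency, in which the middle of a dataset is determined by adding all of its numeric values and dividing the sum by the number of values.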
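The calculation can be sketched in a few lines of Python. The dataset below is made up for illustration; `statistics.mean` is the standard-library function and is used only to cross-check the manual result.

```python
from statistics import mean

values = [2, 4, 4, 6, 9]  # hypothetical dataset

# Add the values together, then divide by how many there are.
manual_mean = sum(values) / len(values)

# The standard library should agree with the manual calculation.
library_mean = mean(values)
print(manual_mean, library_mean)
```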

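Attitudinal statement (態度陳述): An attitudinal statement appears in a scaled question on a survey and asks an individual to rate his or her feelings toward a particular topic.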

Axis label (軸標籤): An axis label is used on a graph to denote the kind of unit or rate of measurement used as the dependent or independent variable (or variables), and can be found along an axis of a graph.

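Back transformation (反向轉換): Back transformation is the process by which mathematical operations are applied to data in a dataset that have already been transformed, in order to back transform, or revert, the data to their original form.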
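One possible illustration, assuming the original transformation was a base-10 logarithm (the glossary does not name a specific transformation): the back transformation raises 10 to the power of each transformed value. The sample values are hypothetical.

```python
import math

original = [1.0, 10.0, 100.0, 1000.0]  # hypothetical values

# Transform: take the base-10 logarithm of each value.
transformed = [math.log10(x) for x in original]

# Back transform: raise 10 to each transformed value to recover the data.
restored = [10 ** y for y in transformed]

# Compare with a tolerance, since floating-point round trips may not be exact.
round_trip_ok = all(math.isclose(a, b) for a, b in zip(original, restored))
print(round_trip_ok)
```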

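Back-end check (後端檢查): A back-end check, also called a server-side check, is a type of data validation for datasets gathered electronically, and is performed at the back end, or after data are stored in an electronic database.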

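Bar graph (長條圖): A bar graph, or bar chart, uses horizontal or vertical bars whose lengths proportionally represent values in a dataset. A chart with vertical bars is also called a column graph or column chart.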

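Cartogram (統計地圖): A cartogram is a map that overlays categorical data onto a projection and uses a different color to represent each category. Unlike a heat map, a cartogram does not necessarily use color saturation to depict the frequency of values in a category.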

Categorical data (類別資料): Categorical data are data, quantitative or qualitative, that can be sorted into distinct categories.

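Category label (類別標籤): A category label is used on a graph to denote the name of a category, or group, of data and may be descriptive or a range of numeric values.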

Chart title (圖表標題): A chart title is the description assigned to the graph and includes a summary of the message aimed at the target audience and may include information about the dataset.

Checkbox response (選取方塊回應): A checkbox response refers to an answer given to a question in a survey administered in electronic form, for which one or more responses can be selected at a time, as may be indicated by a checkbox an individual clicks on.

Chroma (彩度): Chroma is the saturation, or vividness, of a hue.

Closed question (封閉式問題): A closed, or closed-ended, question is a type of question featured in a poll or survey that requires a limited, or specific kind of, response and is used to collect quantitative data or data that can be analyzed quantitatively later on.

Codebook (編碼簿): A codebook documents the descriptions, terms, variables, and values that are represented by abbreviated or coded words or symbols used in a dataset, and serves as a means for coding and decoding the information.

Color theory (色彩理論): Color theory refers to principles of design that focus on colors and the relationships between them.

Continuous variable (連續變數): A continuous variable, or continuous scale, has an unlimited number of possible values between the highest and lowest values in a dataset.

Correlation (相關): Correlation measures the degree of association, or the strength of the relationship, between two variables using mathematical operations.
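The glossary does not say which measure of correlation is meant. As one common example, the sketch below computes Pearson's correlation coefficient by hand; the choice of Pearson's r, the function name `pearson_r`, and the sample data are all assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


hours = [1, 2, 3, 4, 5]      # hypothetical independent variable
scores = [2, 4, 6, 8, 10]    # hypothetical dependent variable
r = pearson_r(hours, scores)
print(r)
```

A value near 1 or -1 indicates a strong linear association, and a value near 0 indicates a weak one. The function fails with a division by zero if either variable is constant.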

CRAAP test (CRAAP 測試): The CRAAP test denotes a set of questions a researcher may use to assess the quality of source information across five criteria: currency, relevance, authority, accuracy, and purpose.

Data cleaning (資料清理): Data cleaning, also called data checking or data validation, is the process by which missing, erroneous, or invalid data are identified and cleaned, or removed, from a dataset. It follows the data preparation process.

Data label (資料標籤): A data label is used on a graph to denote the value of a plotted point.

Data preparation (資料準備): Data preparation is the process by which data are readied for analysis and includes the formatting, or normalizing, of values in a dataset.

Data transformation (資料轉換): Data transformation is the process by which data in a dataset are transformed, or changed, during data cleaning and involves the use of mathematical operations in order to reveal features of the data that are not observable in their original form.

Data visualization (資料視覺化): Data visualization, or data presentation, is the process by which data are visualized, or presented, after the data cleaning process, and involves making choices about which data will be visualized, how data will be visualized, and what message will be shared with the target audience of the visualization. The end result may be referred to as a data visualization.

Data (資料): Data are observations, facts, or numeric values that can be described or measured, interpreted or analyzed.

Dependent variable (應變數): A dependent variable is a type of variable whose value is determined by, or depends on, another variable.

Descriptive statistics (敘述統計): Descriptive statistics is a type of applied statistics that numerically summarizes or describes data that have already been collected and is limited to the dataset.

Diary (日誌): A diary is a data collection method in which data, qualitative or quantitative, are tracked over an extended period of time.

Dichotomous question (二分問題): A dichotomous question is a type of closed question featured in a poll or survey that requires an individual to choose only one of two possible responses.

Direct measurement (直接測量): Direct measurement is a type of measurement method that involves taking an exact measurement of a variable and recording that numeric value in a dataset.

Discrete variable (離散變數): A discrete variable, or a discrete scale, has a limited number of possible values between the highest and lowest values in a dataset.

External data (外部資料): External data refer to data that a researcher or organization uses, but which have been collected by an outside researcher or organization.

Factoid (類事實): A factoid, or trivial fact, is a single piece of information that emphasizes a particular point of view, idea, or detail. A factoid does not allow for any further statistical analysis.

Filter (篩選器): A filter is a programmed list of conditions that filters, or checks, items that meet those conditions and may specify further instructions for the filtered items.
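A filter of this kind might look like the following Python sketch, in which the list of conditions is written as small functions and an item passes only if it meets all of them. The records, field names, and the function name `apply_filter` are hypothetical.

```python
# Hypothetical survey records.
responses = [
    {"id": 1, "age": 34, "country": "CA"},
    {"id": 2, "age": 17, "country": "US"},
    {"id": 3, "age": 52, "country": "CA"},
]

# The filter's conditions, each written as a small function.
conditions = [
    lambda r: r["age"] >= 18,
    lambda r: r["country"] == "CA",
]


def apply_filter(rows, conds):
    """Keep only the rows that satisfy every condition."""
    return [row for row in rows if all(cond(row) for cond in conds)]


matching = apply_filter(responses, conditions)
print([row["id"] for row in matching])
```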

Focus group (焦點團體): A focus group is a data collection method used for qualitative research in which a group of selected individuals participate in a guided discussion.

Forced question (必填問題): A forced question is a type of scaled question featured in a survey that requires an individual to choose from a given range of possible responses, none of which is neutral.

Front-end check (前端檢查): A front-end check, also called a client-side check, is a type of data validation for datasets gathered electronically, and is performed at the front end, or before data are stored in an electronic database.

Graphical user interface (GUI) (圖形使用者介面): A graphical user interface, or GUI, is a type of interface that allows a user to interact with a computer through graphics, such as icons and menus, in place of lines of text.

Heat map (熱圖): A heat map is a graph that uses colors to represent categorical data in which the saturation of the color reflects the category’s frequency in the dataset.

Histogram (直方圖): A histogram is a graph that uses bars to represent proportionally a continuous variable according to how frequently the values occur within a dataset.

Hue (色相): A hue, as defined in color theory, is a color without any black or white pigments added to it.

Independent variable (自變數): An independent variable is a type of variable that can be changed, or manipulated, and determines the value of at least one other variable.

Inferential statistics (推理統計): Inferential statistics is a type of applied statistics that makes inferences, or predictions, beyond the dataset.

Infographic (資訊圖表): An infographic is a graphical representation of data that may combine several different types of graphs and icons in order to convey a specific message to a target audience.

Interactive graphic (互動圖像): An interactive graphic is a type of visualization designed for digital or print media that presents information that allows, and may require, input from the viewer.

Interviewer effect (訪談者效應): Interviewer effect refers to any effect an interviewer can have on subjects such that he or she influences the responses to the questions.

Invalid data (無效資料): Invalid data are values in a dataset that are found, during data cleaning, to fall outside the range of valid, or acceptable, values.

Leading question (引導性問題): A leading question is a type of question featured in a poll or survey that prompts, or leads, an individual to choose a particular response and produces a skewed, or biased, dataset.

Legend (圖例): A legend is used on a graph in order to denote the meaning of colors, abbreviations, or symbols used to represent data in a dataset.

Legibility (可辨識性): Legibility is a term used in typography and refers to the ease with which individual characters in a text can be distinguished from one another when read.

Line graph (線圖): A line graph uses plotted points that are connected by a line to represent values of a dataset with one or more dependent variables and one independent variable.

Median (中位數): A median is a type of average, or measure of central tendency, in which the middle of a dataset is determined by arranging its numeric values in order and taking the middle value.
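A short Python sketch of the median follows. The glossary does not say what to do when a dataset has an even number of values; the sketch assumes the usual convention of averaging the two middle values. The function name `manual_median` and the sample data are hypothetical, and `statistics.median` from the standard library is used only as a cross-check.

```python
from statistics import median


def manual_median(values):
    """Arrange the values in order and return the middle one."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Assumed convention for an even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_count = [7, 1, 5]
even_count = [7, 1, 5, 3]
print(manual_median(odd_count), manual_median(even_count))
```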

Metadata (元資料): Metadata are data about other data, and may be used to clarify or give more information about some part or parts of another dataset.

Missing data (遺漏資料): Missing data are values in a dataset that have not been stored sufficiently, whether blank or partial, and may be marked by the individual working with the dataset.

Mode (衆數): A mode is a numeric value that appears most often in a dataset.

Motion graphic (動態圖像): A motion graphic is a type of visualization designed for digital media that presents moving information without need for input from the viewer.

Multiseries (多系列): A multiseries is a dataset that compares multiple series, or two or more dependent variables and one independent variable.

Normal distribution (常態分布): A normal distribution, often called a bell curve, is a type of data distribution in which the values in a dataset are distributed symmetrically around the mean value. Normally distributed data take the shape of a bell when represented on a graph, the center of which is determined by the mean of the sample, and the width of which is determined by the standard deviation of the sample.

Open content (開放內容): Open content, open access, open source, and open data are closely related terms that refer to digital works that are free of most copyright restrictions. Generally, the original creator has licensed a work for use by others at no cost so long as some conditions, such as author attribution, are met (See: Suber, Peter. Open Access, Cambridge, Massachusetts: MIT Press, 2012). Conditions vary from license to license and determine how open the content is.

Open question (開放性問題): An open, or open-ended, question is a type of question featured in a survey that does not require a specific kind of response and is used to collect qualitative data.

Order bias (顺序偏差): Order bias occurs when the sequencing of questions featured in a survey has an effect on the responses an individual chooses, and produces a biased, or skewed, dataset.

Outlier (異常值): An outlier is an extremely high or extremely low numeric value that lies outside the distribution of most of the values in a dataset.

Pattern matching (樣式比對): Pattern matching is the process by which a sequence of characters is checked against a pattern in order to determine whether the characters are a match.
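Regular expressions are one widely used way to do this. The sketch below uses Python's standard `re` module to check whether a string has the shape of a date written as YYYY-MM-DD; the pattern and the function name `matches_date_format` are hypothetical examples, not something the glossary prescribes.

```python
import re

# Hypothetical pattern: four digits, a hyphen, two digits, a hyphen, two digits.
# It checks only the shape of the text, not whether the date itself is valid.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def matches_date_format(text):
    """Return True if the whole string matches the date pattern."""
    return DATE_PATTERN.fullmatch(text) is not None


print(matches_date_format("2014-03-07"))
print(matches_date_format("07/03/2014"))
```

Using `fullmatch` rather than `search` means the entire sequence of characters must fit the pattern, not just part of it.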

Peer-to-peer (P2P) network (點對點網路): A peer-to-peer network, often abbreviated P2P, is a network of computers that allows for peer-to-peer sharing, or shared access to files stored on the computers in the network rather than on a central server.

Pie chart (圓餅圖): A pie chart is a circular graph divided into sectors, each with an area relative to the whole circle, and is used to represent the frequency of values in a dataset.

Population (群體): A population is the complete set from which a sample is drawn.

Probability (或然率): Probability is the measure of how likely, or probable, it is that an event will occur.

Qualitative data (質性資料): Qualitative data are a type of data that describe the qualities or attributes of something using words or other non-numeric symbols.

Quantitative data (計量資料): Quantitative data are a type of data that quantify or measure something using numeric values.

Radio response (選取回應): A radio response refers to an answer given to a question in a poll or survey administered in electronic form, for which only one response can be selected at a time, as may be indicated by a round radio button an individual clicks on.

Range check (範圍檢查): A range check is a type of check used in data cleaning that determines whether any values in a dataset fall outside a particular range.
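A range check can be sketched as a small Python function that returns the values falling outside the allowed range. The function name `range_check`, the inclusive bounds, and the sample ages are assumptions made for illustration.

```python
def range_check(values, low, high):
    """Return the values that fall outside the inclusive range [low, high]."""
    return [v for v in values if v < low or v > high]


ages = [23, 41, -5, 67, 130]  # hypothetical survey answers
out_of_range = range_check(ages, 0, 120)
print(out_of_range)
```

Values flagged this way would be treated as invalid data and handled during data cleaning.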

Range (範圍): A range is determined by taking the difference between the highest and lowest numeric values in a dataset.

Raw data (原始資料): Raw data refer to data that have only been collected, not manipulated or analyzed, from a source.

Readability (可讀性): Readability is a term used in typography and refers to the ease with which a sequence of characters in a text can be read. Factors affecting readability include the placement of text on a page and the spacing between characters, words, and lines of text.

Sample (抽樣): A sample is a set of collected data.

Sampling bias (抽樣偏差): Sampling bias occurs when some members of a population are more or less likely than other members to be represented in a sample of that population.

Scaled question (量表問題): A scaled question is a type of question featured in a survey that requires an individual to choose from a given range of possible responses.

Scatterplot (散佈圖): A scatterplot uses plotted points (that are not connected by a line) to represent values of a dataset with one or more dependent variables and one independent variable.

Series graph (序列圖): A series graph proportionally represents values of a dataset with two or more dependent variables and one independent variable.

Series (序列): A series is a dataset that compares one or more dependent variables with one independent variable.

Shade (加深): Shade refers to adding black to a hue in order to darken it.

Skewed data (偏斜資料): Skewed data are data with a non-normal, asymmetric distribution whose longer tail extends to the left, as in left-skewed, or to the right, as in right-skewed, of the bulk of the values when represented on a graph.

Stacked bar graph (堆疊長條圖): A stacked bar graph is a type of bar graph whose bars are divided into sub-sections, each of which proportionally represents a category of data in a dataset; the sub-sections can be stacked together to form a larger category.

Standard deviation (標準差): A standard deviation is a measure of how much the values in a dataset vary, or deviate, from the arithmetic mean, and is calculated by taking the square root of the variance.
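The relationship between variance and standard deviation can be sketched in Python. The glossary does not say whether the population or the sample formula is meant; this sketch assumes the population form, which divides by the number of values. The dataset is hypothetical, and `statistics.pvariance` and `statistics.pstdev` are used only to cross-check the manual steps.

```python
import math
from statistics import pstdev, pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical dataset

# Arithmetic mean of the dataset.
mean_value = sum(values) / len(values)

# Population variance (an assumption): the average squared deviation from the mean.
variance = sum((v - mean_value) ** 2 for v in values) / len(values)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(variance, std_dev)
```

A sample standard deviation would divide by one less than the number of values instead, which gives a slightly larger result.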

Static graphic (靜態圖像): A static graphic is a type of visualization designed for digital or print media that presents information without need for input from the viewer.

Statistics (統計學): Statistics is the study of collecting, measuring, and analyzing quantitative data using mathematical operations.

Summable multiseries (可和多系列): A summable multiseries is a type of multiseries with two or more dependent variables that can be added together and compared with an independent variable.

Summary record (摘要記錄): A summary record is a record in a database that has been sorted, or aggregated, in some way after having been collected.

Tint (淡化): Tint refers to adding white to a hue in order to lighten it.

Transactional record (交易記錄): A transactional record is a record in a database that has not yet been sorted, or aggregated, after collection.

Value (color) (明度(色彩)): Value, or brightness, refers to the tint, shade, or tone of a hue that results from adding black or white pigments to a base color.

Variance (變量): Variance, or statistical variance, is a measure of how spread out the numeric values in a dataset are, or how much the values vary, from the arithmetic mean.
