## Outliers/High Leverage observations and influential points - Computer

In this section, we learn the distinction between outliers and high leverage observations. In short:

- An outlier is a data point whose response **y** does not follow the general trend of the rest of the data.

- A data point has high leverage if it has "extreme" predictor **x** values. With a single predictor, an extreme x value is simply one that is particularly high or low. With multiple predictors, extreme x values may be particularly high or low for one or more predictors, or may be "unusual" combinations of predictor values (e.g., with two predictors that are positively correlated, an unusual combination of predictor values might be a high value of one predictor paired with a low value of the other predictor).

Note that, for our purposes, we consider a data point to be an outlier only if it is extreme with respect to the other y values, not the x values.

A data point is influential if it unduly influences any part of a regression analysis, such as the predicted responses, the estimated slope coefficients, or the hypothesis test results. Outliers and high leverage data points have the potential to be influential, but we generally have to investigate further to determine whether or not they are actually influential.

One advantage of the case in which we have only one predictor is that we can look at simple scatter plots in order to identify any outliers and influential data points. Let's take a look at a few examples that should help to clarify the distinction between the two types of extreme values.
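With a single predictor, leverage and influence can also be computed directly rather than read off a plot. The following is a minimal sketch and not part of the original course text: the function name `slr_diagnostics` is hypothetical, and Cook's distance is used here as one common influence measure (an assumption, since the text above does not name one). Leverage depends only on the x values, while Cook's distance combines the residual (how far y is from the fitted trend) with the leverage.

```c
#include <assert.h>
#include <stddef.h>

/* Diagnostics for the simple linear regression y = b0 + b1*x.
 * For each point i this fills in:
 *   h[i]  leverage, 1/n + (x_i - mean(x))^2 / Sxx  (uses x only)
 *   e[i]  raw residual, y_i minus the fitted value
 *   d[i]  Cook's distance, e_i^2 * h_i / (p * s^2 * (1 - h_i)^2), p = 2
 * Assumes n >= 3, the x values are not all equal, the fit is not exact
 * (s^2 > 0), and no point has leverage exactly 1. */
static void slr_diagnostics(const double *x, const double *y, size_t n,
                            double *h, double *e, double *d)
{
  double mx = 0, my = 0, sxx = 0, sxy = 0, sse = 0, b0, b1, s2;
  size_t i;

  assert(n >= 3);
  for (i = 0; i < n; ++i) { mx += x[i]; my += y[i]; }
  mx /= (double)n;
  my /= (double)n;

  for (i = 0; i < n; ++i) {
    sxx += (x[i] - mx) * (x[i] - mx);
    sxy += (x[i] - mx) * (y[i] - my);
  }
  assert(sxx > 0);
  b1 = sxy / sxx;      /* least-squares slope */
  b0 = my - b1 * mx;   /* least-squares intercept */

  for (i = 0; i < n; ++i) {
    h[i] = 1.0 / (double)n + (x[i] - mx) * (x[i] - mx) / sxx;
    e[i] = y[i] - (b0 + b1 * x[i]);
    sse += e[i] * e[i];
  }
  s2 = sse / (double)(n - 2);   /* residual variance estimate */
  assert(s2 > 0);

  for (i = 0; i < n; ++i) {
    assert(h[i] < 1);
    d[i] = e[i] * e[i] * h[i] / (2 * s2 * (1 - h[i]) * (1 - h[i]));
  }
}
```

For example, with x = {1, 2, 3, 4, 10} and y = {1, 2, 3, 4, 5}, the last point has leverage 0.92 and by far the largest Cook's distance, so it is both a high leverage point and an influential one, even though its raw residual is not the largest.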

from https://onlinecourses.science.psu.edu/stat501/node/337

written time : 2017-08-20 20:54:42.0

## Qual prep 1 - Computer

stat 2

- 1-way RM ANOVA (within-groups SS splits into SS_subjects and SS_error)

- 2-way ANOVA

stat 3

- pooled proportion to SE(pooled q!!, sqrt((pooledP * pooledQ) / N))

- proportion: 2-way table

ml 0

- Forward, State Prob.

- EM: theta is the emission

- GMM: Euclidean / minimize variance

- SVM (maximal margin) -> which points??

ml 1

- bootstrap: lim (1 - 1/n)^n = e^-1 as n -> infinity (chance that a given observation is left out of a bootstrap sample)
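The limit in the bootstrap note can be checked numerically. This is a small sketch; the function name `prob_left_out` is made up for illustration. It multiplies (1 - 1/n) by itself n times, which is the probability that one fixed observation is never picked in n draws with replacement.

```c
#include <assert.h>

/* Probability that a given observation is absent from a bootstrap sample
 * of size n drawn with replacement from n observations: (1 - 1/n)^n.
 * Computed by repeated multiplication so that no math library is needed. */
static double prob_left_out(unsigned n)
{
  double q, result = 1;
  unsigned i;

  assert(n >= 1);
  q = 1.0 - 1.0 / (double)n;   /* chance of missing it in a single draw */
  for (i = 0; i < n; ++i)
    result *= q;
  return result;
}
```

The value is exactly 0.25 for n = 2 and rises toward e^-1 (about 0.368) as n grows, which is why roughly 63% of the distinct observations appear in a typical bootstrap sample.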

ml 2

- WCV is 2 times the cluster variance

- Random Forest

- Effective df

- Stacked classifier

- Ensemble: p(majority)
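The "p(majority)" item can be worked through as a binomial sum. The sketch below rests on assumptions the note does not spell out: n is odd, every classifier is correct with the same probability p, and the classifiers err independently. The function name `majority_correct` is hypothetical.

```c
#include <assert.h>

/* Probability that a majority vote of n classifiers is correct, assuming
 * each is independently correct with probability p and n is odd:
 *   sum over k = (n+1)/2 .. n of C(n,k) * p^k * (1-p)^(n-k) */
static double majority_correct(unsigned n, double p)
{
  double total = 0;
  unsigned k, j;

  assert(n % 2 == 1);
  assert(p >= 0 && p <= 1);

  for (k = n / 2 + 1; k <= n; ++k) {
    /* Build C(n,k) * p^k one factor at a time: C(n,k) = prod (n-k+j)/j */
    double term = 1;
    for (j = 1; j <= k; ++j)
      term *= (double)(n - k + j) / (double)j * p;
    /* Multiply in (1-p)^(n-k) for the classifiers that are wrong */
    for (j = 0; j < n - k; ++j)
      term *= 1 - p;
    total += term;
  }
  return total;
}
```

With three classifiers at p = 0.7 the majority is right with probability 3(0.49)(0.3) + 0.343 = 0.784, which is better than any single one of them. The gain depends on the independence assumption and shrinks when the classifiers make correlated errors.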

written time : 2017-08-18 23:44:14.0

## Upsampling example, 8 kHz to 16 kHz, using libsoxr - Computer

    /* SoX Resampler Library Copyright (c) 2007-13 robs@users.sourceforge.net
     * Licence for this file: LGPL v2.1 See LICENCE for details. */

    /* Adapted from libsoxr example 5 (variable-rate resampling), cut down to
     * a fixed-rate case: 16-bit mono samples are read from re8khz.raw,
     * upsampled from 8 kHz to 16 kHz, and sent to stdout as raw int16
     * samples. The command-line option of the original example is not used. */

    #include <soxr.h>

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    #undef min
    #define min(x,y) ((x)<(y)?(x):(y))
    #define AL(a) (sizeof(a)/sizeof((a)[0])) /* Array Length */

    int main(int argc, char *arg[])
    {
      static short ibuf[83480];  /* Input: up to 83480 samples at 8 kHz */
      short obuf[640];           /* Output block */
      size_t ilen, odone, total_odone = 0, total_olen;
      soxr_in_t in = ibuf;       /* Set to NULL once the input has been given */
      soxr_error_t error;
      soxr_t soxr;
      FILE *file;

      /* 16-bit interleaved samples for both input and output: */
      soxr_io_spec_t iospec = soxr_io_spec(SOXR_INT16_I, SOXR_INT16_I);

      (void)argc;

      /* Read the whole 8 kHz input file into ibuf: */
      file = fopen("re8khz.raw", "rb");
      if (!file) {
        fprintf(stderr, "%s: re8khz.raw: %s\n", arg[0], strerror(errno));
        return 1;
      }
      ilen = fread(ibuf, sizeof(ibuf[0]), AL(ibuf), file);
      fclose(file);
      total_olen = 2 * ilen;  /* 8 kHz -> 16 kHz doubles the sample count */

      /* Fixed-rate, mono resampler with default quality and runtime specs: */
      soxr = soxr_create(8000, 16000, 1, &error, &iospec, NULL, NULL);

      if (!error) {
        /* Resample in blocks of at most AL(obuf) samples: */
        while (!error && total_odone < total_olen) {
          /* The last block might be shorter: */
          size_t block_len = min(AL(obuf), total_olen - total_odone);

          /* The first call supplies all of the input (idone is NULL, as in
           * the original example). Later calls pass in == NULL, which marks
           * end-of-input, so the rest of the output is drained without
           * feeding the same samples a second time: */
          error = soxr_process(soxr, in, in? ilen : 0, NULL,
              obuf, block_len, &odone);
          fwrite(obuf, sizeof(obuf[0]), odone, stdout);
          total_odone += odone;

          if (!in && !odone) break;  /* Nothing left to drain */
          in = NULL;
        }
        soxr_delete(soxr);
      }

      /* Diagnostics: */
      fprintf(stderr, "%-26s %s; I/O: %s\n", arg[0], soxr_strerror(error),
          ferror(stdout)? strerror(errno) : "no error");
      return !!error;
    }
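The two buffer sizes in the program, 83480 input samples and 166960 output samples, are related by the rate ratio. The helper below is a hypothetical illustration of that arithmetic and is not part of libsoxr; it rounds to the nearest whole sample, which is an assumption about how one might size the output, not a statement about what the resampler itself returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libsoxr function): expected number of output
 * samples when ilen samples at in_rate are resampled to out_rate, rounded
 * to the nearest whole sample. */
static size_t resampled_length(size_t ilen, double in_rate, double out_rate)
{
  assert(in_rate > 0 && out_rate > 0);
  return (size_t)((double)ilen * out_rate / in_rate + 0.5);
}
```

For the file used here, `resampled_length(83480, 8000, 16000)` gives 166960, the value used for `total_olen`. Both counts correspond to the same duration, 83480 / 8000 = 10.435 seconds of audio.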

written time : 2017-08-16 23:04:17.0