AP Statistics

AP Statistics Help: Sampling For Differences In Sample Proportions

Review real example questions for Sampling For Differences In Sample Proportions in AP Statistics.

Question 1

A university compares the proportion of students who report being satisfied with dining services between on-campus and off-campus students. An SRS of $n_1=90$ on-campus students and an independent SRS of $n_2=110$ off-campus students are surveyed; $\hat p_1$ and $\hat p_2$ are the sample proportions satisfied. The sampling distribution of $\hat p_1-\hat p_2$ is modeled for repeated sampling. Which statement is correct?

  1. The sampling distribution of $\hat p_1-\hat p_2$ describes how $p_1-p_2$ varies from sample to sample.
  2. The sampling distribution of $\hat p_1-\hat p_2$ describes how $\hat p_1-\hat p_2$ varies from sample to sample.
  3. The sampling distribution of $\hat p_1-\hat p_2$ is the distribution of individual responses (satisfied vs. not) pooled across both groups.
  4. The sampling distribution of $\hat p_1-\hat p_2$ is the same as the distribution of $\hat p_1$ alone because both are proportions.
  5. The sampling distribution of $\hat p_1-\hat p_2$ cannot be centered at $p_1-p_2$ unless $n_1=n_2$.
Explanation: This question asks what actually varies in a sampling distribution. The sampling distribution of $\hat{p}_1 - \hat{p}_2$ describes how the sample statistic $\hat{p}_1 - \hat{p}_2$ varies from sample to sample when the sampling process is repeated, so choice 2 is correct. The population parameters $p_1$ and $p_2$ are fixed and do not vary (eliminating choice 1). The distribution is not about individual responses but about the difference in sample proportions (eliminating choice 3). The distributions of $\hat{p}_1$ alone and of $\hat{p}_1 - \hat{p}_2$ are different (eliminating choice 4). The center is $p_1 - p_2$ regardless of whether the sample sizes are equal (eliminating choice 5).

Question 2

A tech company compares the proportion of users who click a new button design on two versions of an app. Version A is shown to an independent random sample of $n_1=250$ users and Version B to $n_2=250$ users; $\hat p_1$ and $\hat p_2$ are the sample click proportions. The company considers the sampling distribution of $\hat p_1-\hat p_2$ over many repetitions. Which statement is correct?

  1. If $p_1-p_2$ is positive, then $\hat p_1-\hat p_2$ must be positive in every sample.
  2. The sampling distribution of $\hat p_1-\hat p_2$ is centered at $p_1-p_2$, and its spread depends on $p_1$, $p_2$, $n_1$, and $n_2$.
  3. The sampling distribution is centered at $\hat p_1-\hat p_2$ and its spread depends only on $n_1+n_2$.
  4. The sampling distribution describes the difference between the two populations, not the difference between the two sample proportions.
  5. The sampling distribution cannot be used unless the two populations have the same size.
Explanation: This question addresses both the center and the spread of the sampling distribution. The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is centered at the population difference $p_1 - p_2$, and its spread (standard deviation) depends on all four values $p_1$, $p_2$, $n_1$, and $n_2$ through the formula $\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$, so choice 2 is correct. Even if $p_1 - p_2 > 0$, sampling variability means some samples could yield negative differences (eliminating choice 1). The center is not the sample statistic (eliminating choice 3). The distribution describes sample statistics, not populations (eliminating choice 4). Population sizes are not relevant to the sampling distribution (eliminating choice 5).
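
The formula above can be tried out numerically. The following is a minimal sketch, not part of the original question: the click rates `p1 = 0.30` and `p2 = 0.25` are hypothetical values chosen for illustration, and the helper name `sd_diff_proportions` is made up for this example. It computes the center and standard deviation, then uses a normal approximation to estimate how often a sample difference would come out negative even though the true difference is positive.

```python
import math
from statistics import NormalDist


def sd_diff_proportions(p1, n1, p2, n2):
    """Standard deviation of p1_hat - p2_hat for independent samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical population click rates (the question does not give them).
p1, p2 = 0.30, 0.25
n1 = n2 = 250  # sample sizes from the question

center = p1 - p2
spread = sd_diff_proportions(p1, n1, p2, n2)

# Normal approximation: chance that a sample difference is negative
# even though the true difference p1 - p2 is positive.
prob_negative = NormalDist(mu=center, sigma=spread).cdf(0.0)

print(f"center = {center:.3f}, SD = {spread:.4f}")
print(f"P(p1_hat - p2_hat < 0) is roughly {prob_negative:.3f}")
```

Under these assumed values, a noticeable share of samples would show a negative difference, which is why choice 1 fails.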

Question 3

A political analyst compares the proportion of voters who favor Candidate X in two counties. An independent random sample of $n_1=500$ registered voters from County 1 and $n_2=500$ from County 2 is taken; $\hat p_1$ and $\hat p_2$ are the sample proportions favoring Candidate X. Over repeated sampling, consider the sampling distribution of $\hat p_1-\hat p_2$. Which statement is correct?

  1. With large sample sizes, $\hat p_1-\hat p_2$ will equal $p_1-p_2$ in every repetition.
  2. The sampling distribution of $\hat p_1-\hat p_2$ is approximately normal if the success–failure condition is met in both groups.
  3. The sampling distribution is approximately normal only when $p_1$ and $p_2$ are both close to 0.5.
  4. The sampling distribution is approximately normal because the population distributions must be normal.
  5. The sampling distribution has no spread because the sample sizes are equal and large.
Explanation: This question tests understanding of the normality conditions for sampling distributions. The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal when the success–failure condition is met in both groups (typically $np \geq 10$ and $n(1-p) \geq 10$ for each group), so choice 2 is correct. Large samples do not eliminate all variability (eliminating choice 1). Normality does not require proportions near 0.5 (eliminating choice 3). The population distributions do not need to be normal for the sampling distribution to be approximately normal (eliminating choice 4). There is still spread due to sampling variability regardless of sample size (eliminating choice 5).

Question 4

A school district wants to compare support for a new start-time policy between two groups of parents. A random sample of $n_1=80$ elementary-school parents and an independent random sample of $n_2=120$ high-school parents are surveyed; in each group, the sample proportion who support the policy is recorded as $\hat p_1$ and $\hat p_2$. If these sampling methods were repeated many times, the distribution of $\hat p_1-\hat p_2$ would be approximately normal with mean $p_1-p_2$ and standard deviation $\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$ (assuming conditions are met). Which statement is correct?

  1. The mean of the sampling distribution of $\hat p_1-\hat p_2$ is $\hat p_1-\hat p_2$ from this one set of samples.
  2. The sampling distribution of $\hat p_1-\hat p_2$ has no variability because $n_1$ and $n_2$ are fixed.
  3. The mean of the sampling distribution of $\hat p_1-\hat p_2$ is $p_1-p_2$.
  4. The standard deviation of $\hat p_1-\hat p_2$ is $\sqrt{\frac{\hat p_1(1-\hat p_1)}{n_1}+\frac{\hat p_2(1-\hat p_2)}{n_2}}$ exactly, for all samples.
  5. The sampling distribution of $\hat p_1-\hat p_2$ is centered at 0 whenever $n_1\neq n_2$.
Explanation: This question tests understanding of the sampling distribution of the difference in sample proportions. The sampling distribution of $\hat{p}_1 - \hat{p}_2$ describes how this difference varies across many repeated samples. Its mean (center) is the true population difference $p_1 - p_2$, so choice 3 is correct; the mean is not the observed difference from one particular set of samples (eliminating choice 1). The distribution does have variability even with fixed sample sizes, because different samples yield different proportions (eliminating choice 2). The standard deviation formula uses the population proportions $p_1$ and $p_2$; substituting the sample proportions gives a standard error, which is an estimate that changes from sample to sample (eliminating choice 4). The center is $p_1 - p_2$ regardless of whether the sample sizes are equal (eliminating choice 5).
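
The repeated-sampling idea in this explanation can be checked with a small simulation. The sketch below is illustrative only: the true support rates `p1 = 0.60` and `p2 = 0.50` are assumptions (the question leaves them unspecified), and `sample_proportion` is a helper invented for this example. It draws many pairs of independent samples and compares the simulated mean and standard deviation of $\hat p_1-\hat p_2$ with $p_1-p_2$ and the formula.

```python
import math
import random
import statistics


def sample_proportion(p, n, rng):
    """Proportion of successes in one random sample of size n."""
    return sum(rng.random() < p for _ in range(n)) / n


rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical true support rates; sample sizes come from the question.
p1, n1 = 0.60, 80
p2, n2 = 0.50, 120

# One difference per repetition of the two-sample survey.
diffs = [
    sample_proportion(p1, n1, rng) - sample_proportion(p2, n2, rng)
    for _ in range(4000)
]

sim_mean = statistics.fmean(diffs)
sim_sd = statistics.pstdev(diffs)
theory_sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

print(f"simulated mean {sim_mean:.4f} vs p1 - p2 = {p1 - p2:.4f}")
print(f"simulated SD   {sim_sd:.4f} vs formula   {theory_sd:.4f}")
```

The individual differences vary from repetition to repetition, while their average settles near $p_1-p_2$ rather than near any single observed difference.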

Question 5

A researcher compares the proportion of plants that survive under two fertilizers. Fertilizer A: random sample of $n_A=40$ plants; Fertilizer B: independent random sample of $n_B=40$ plants. The statistic is $\hat{p}_A-\hat{p}_B$. Consider the condition for using a normal approximation for the sampling distribution of $\hat{p}_A-\hat{p}_B$. Which statement is correct?

  1. A normal approximation is appropriate only if $p_A=p_B$.
  2. A normal approximation is appropriate if each sample has at least 10 expected successes and 10 expected failures: $n_Ap_A,\ n_A(1-p_A),\ n_Bp_B,\ n_B(1-p_B)\ge 10$.
  3. A normal approximation is appropriate whenever $n_A+n_B\ge 30$.
  4. A normal approximation is never appropriate for a difference of two sample proportions.
  5. A normal approximation is appropriate if $\hat{p}_A$ and $\hat{p}_B$ from one sample are both between 0.4 and 0.6.
Explanation: This question assesses the conditions for normality of the sampling distribution of a difference in proportions, which AP Statistics requires for valid inference. The normal approximation holds if each group has at least 10 expected successes and at least 10 expected failures, which ensures that the sampling distribution of each sample proportion is roughly mound-shaped (choice 2). Choice 3 is a distractor: it oversimplifies to a total sample size of $n_A+n_B\ge 30$ without checking the success-failure counts in each group. Mini-lesson: for $\hat{p}_A - \hat{p}_B$ from independent samples, normality requires $np \geq 10$ and $n(1-p) \geq 10$ for each group, which makes the difference approximately normal, centered at $p_A - p_B$, with a calculable spread. Equal proportions or observed sample values alone do not guarantee this. Checking the condition guards against using a normal model when the true sampling distribution is skewed.
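
The success-failure check described above is easy to express in code. This is a minimal sketch: the function name `success_failure_ok` and the survival rates passed to it are hypothetical, chosen only to show one case that passes and one that fails with the question's sample sizes of 40.

```python
def success_failure_ok(n_a, p_a, n_b, p_b, minimum=10):
    """True if all four expected counts reach the minimum."""
    expected_counts = [
        n_a * p_a,        # expected successes, group A
        n_a * (1 - p_a),  # expected failures, group A
        n_b * p_b,        # expected successes, group B
        n_b * (1 - p_b),  # expected failures, group B
    ]
    return all(count >= minimum for count in expected_counts)


n_a = n_b = 40  # sample sizes from the question

# Hypothetical survival rates, used only to illustrate the check.
print(success_failure_ok(n_a, 0.50, n_b, 0.30))  # expected counts 20, 20, 12, 28
print(success_failure_ok(n_a, 0.50, n_b, 0.20))  # expected counts 20, 20, 8, 32
```

In the second call the total sample size is 80, well above 30, yet one expected count is only 8, which is the situation choice 3 overlooks.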

Question 6

A tech company compares the proportion of users who enable two-factor authentication on two platforms. Platform W: SRS of $n_W=1000$ users; Platform M: independent SRS of $n_M=1000$ users. The statistic is $\hat{p}_W-\hat{p}_M$. Suppose $p_W$ and $p_M$ stay the same over time and the sampling method stays the same. Which statement is correct about what would happen to the sampling distribution if both sample sizes were reduced to $n_W=n_M=100$?

  1. The sampling distribution would have the same center but typically larger standard deviation (more spread).
  2. The sampling distribution would shift its center from $p_W-p_M$ to $\hat{p}_W-\hat{p}_M$.
  3. The sampling distribution would have smaller spread because smaller samples are less variable.
  4. The sampling distribution would become centered at 0 regardless of $p_W-p_M$.
  5. The sampling distribution would have no change because the populations did not change.
Explanation: This question investigates how changing the sample sizes affects the sampling distribution of a difference in proportions, a practical AP Statistics concept for study design. Reducing the sample sizes keeps the center at $p_W - p_M$ but increases the spread, because each term in the standard deviation formula has $n$ in its denominator (choice 1). Choice 3 is a distractor: it claims smaller samples reduce variability, when the opposite is true because smaller samples carry less information. Mini-lesson: for independent samples, the center of the distribution does not depend on the sample sizes, but the spread is inversely proportional to $\sqrt{n}$ when both sizes change by the same factor. Cutting both sizes from 1000 to 100 therefore multiplies the standard deviation by $\sqrt{10}\approx 3.16$, without shifting the center to the observed difference or to zero. The populations staying the same does not cancel the effect of sample size. This informs the trade-off between precision and feasibility.
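
The effect of shrinking both samples can be computed directly. In the sketch below, the adoption rates `p_w = 0.60` and `p_m = 0.50` are hypothetical (the question only says they stay fixed), and `sd_diff` is a helper name made up for this example.

```python
import math


def sd_diff(p_w, n_w, p_m, n_m):
    """Standard deviation of p_w_hat - p_m_hat for independent samples."""
    return math.sqrt(p_w * (1 - p_w) / n_w + p_m * (1 - p_m) / n_m)


# Hypothetical adoption rates; they stay fixed while the sample sizes change.
p_w, p_m = 0.60, 0.50

sd_large = sd_diff(p_w, 1000, p_m, 1000)  # original design
sd_small = sd_diff(p_w, 100, p_m, 100)    # reduced design
ratio = sd_small / sd_large

print(f"center in both designs: {p_w - p_m:.2f}")
print(f"SD with n = 1000: {sd_large:.4f}")
print(f"SD with n = 100:  {sd_small:.4f}")
print(f"ratio of SDs: {ratio:.3f}")
```

Because both sample sizes shrink by the same factor of 10, the ratio of the two standard deviations does not depend on the assumed proportions.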

Question 7

Two independent random samples are taken to compare the proportion of voters who support a ballot measure in two counties. County X: $n_X=60$; County Y: $n_Y=60$. The statistic is $\hat{p}_X-\hat{p}_Y$. Suppose the true population proportions are $p_X=0.50$ and $p_Y=0.50$. Over many repetitions, which statement is correct about the sampling distribution of $\hat{p}_X-\hat{p}_Y$?

  1. Its mean is 0, but it will still vary from sample to sample.
  2. Its mean is 0, and it will always equal 0 in repeated samples.
  3. Its mean equals $\hat{p}_X-\hat{p}_Y$ from the first pair of samples.
  4. Its mean must be positive because sample proportions are always between 0 and 1.
  5. Its mean is 1 because the two sample proportions add to 1.
Explanation: This question probes the properties of the sampling distribution of a difference in proportions when the population proportions are equal, the situation assumed under the null hypothesis in AP Statistics. With $p_X = p_Y = 0.5$, the center is 0, but there is still spread due to sampling variability, so $\hat{p}_X - \hat{p}_Y$ fluctuates around 0 in repeated samples (choice 1). Choice 2 is a distractor: it implies no variability when the mean is 0, ignoring that even equal population proportions yield differing sample outcomes by chance. Mini-lesson: for independent samples, the distribution of $\hat{p}_1 - \hat{p}_2$ is centered at $p_1 - p_2$ (here 0), with a standard deviation that reflects the sample sizes and proportions, so it always shows spread for finite sample sizes. This variability is key to understanding p-values in tests of equal proportions. Proportions being between 0 and 1 does not force the difference to be positive.
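
To see how much $\hat{p}_X-\hat{p}_Y$ can wander even when the true difference is 0, the normal model can be evaluated with the values given in the question. This is a rough sketch under the normal approximation; the cutoff of 0.10 is an arbitrary illustrative choice, not something the question specifies.

```python
import math
from statistics import NormalDist

# Values given in the question.
p_x = p_y = 0.50
n_x = n_y = 60

mean_diff = p_x - p_y
sd = math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)

# Normal model for the sampling distribution (expected counts are 30 each).
model = NormalDist(mu=mean_diff, sigma=sd)

# Illustrative cutoff: how often do the sample proportions differ by more
# than 10 percentage points in either direction?
threshold = 0.10
prob_far = model.cdf(-threshold) + (1 - model.cdf(threshold))

print(f"mean = {mean_diff:.2f}, SD = {sd:.4f}")
print(f"P(|difference| > {threshold:.2f}) is roughly {prob_far:.3f}")
```

A sizable fraction of repetitions lands more than 10 points away from 0, so a mean of 0 clearly does not imply that every sample difference equals 0.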

Question 8

A company compares the proportion of customers who renew a subscription under two email campaigns. From Campaign 1, an SRS of $n_1=80$ customers is selected; from Campaign 2, an independent SRS of $n_2=320$ customers is selected. The statistic of interest is $\hat{p}_1-\hat{p}_2$. Over many repetitions, the sampling distribution of $\hat{p}_1-\hat{p}_2$ is centered at $p_1-p_2$ and has standard deviation $\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$. Which statement is correct?

  1. The sampling distribution of $\hat{p}_1-\hat{p}_2$ is centered at 0 whenever $n_1\neq n_2$.
  2. The sampling distribution of $\hat{p}_1-\hat{p}_2$ is centered at $p_1-p_2$.
  3. Because $n_2$ is larger than $n_1$, the sampling distribution depends only on $n_2$.
  4. The sampling distribution has no spread if the samples are independent.
  5. The center of the sampling distribution is $\hat{p}_1-\hat{p}_2$ for any particular pair of samples.
Explanation: This question evaluates knowledge of the sampling distribution of a difference in sample proportions, which AP Statistics uses to compare success rates between groups such as subscription renewals. The distribution is centered at $p_1 - p_2$ (choice 2), with spread given by $\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$, which accounts for unequal sample sizes without letting the larger one determine the result on its own. Choice 5 is a distractor: it claims the center is the observed difference, which varies from sample to sample, while the true center remains the population parameter. Mini-lesson: the sampling distribution of $\hat{p}_1 - \hat{p}_2$ from independent samples models how this statistic behaves over many repetitions. It is always centered at the fixed value $p_1 - p_2$, with variability that combines the uncertainty from each sample. Larger samples reduce the spread, improving reliability for hypothesis tests and confidence intervals. Even with $n_2 > n_1$, both samples contribute to the overall variance.
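
The way the two samples' uncertainties combine can be written out as a short derivation. This sketch assumes the two samples are independent and that each sample proportion has the usual mean $p_i$ and variance $p_i(1-p_i)/n_i$ (for example, each sample is small relative to its population); those assumptions are standard for this topic but are stated here rather than taken from the question.

```latex
\begin{align*}
E(\hat{p}_1 - \hat{p}_2) &= E(\hat{p}_1) - E(\hat{p}_2) = p_1 - p_2, \\
\operatorname{Var}(\hat{p}_1 - \hat{p}_2)
  &= \operatorname{Var}(\hat{p}_1) + \operatorname{Var}(\hat{p}_2)
   = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}, \\
\sigma_{\hat{p}_1 - \hat{p}_2}
  &= \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}.
\end{align*}
```

The variances add, even though the statistics are subtracted, because independence removes any covariance term. Neither $n_1$ nor $n_2$ drops out, which is why choice 3 cannot be right.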

Question 9

A streaming service compares the proportion of users who finish a new series in two regions. Region R: an SRS of $n_R=500$ users; Region S: an independent SRS of $n_S=50$ users. The statistic is $\hat{p}_R-\hat{p}_S$. In repeated sampling, the sampling distribution is centered at $p_R-p_S$ and its standard deviation is influenced by both sample sizes. Which statement is correct?

  1. The sampling distribution will be more variable because $n_R$ is large.
  2. The sampling distribution’s variability is driven more by the smaller sample size $n_S$ than by the larger $n_R$.
  3. The sampling distribution depends only on $p_R-p_S$, not on $n_R$ or $n_S$.
  4. Because $n_R\neq n_S$, the sampling distribution cannot be approximately normal.
  5. The sampling distribution is centered at $\hat{p}_R-\hat{p}_S$ regardless of the true $p_R-p_S$.
Explanation: This question examines how unequal sample sizes influence the sampling distribution of a difference in proportions, an important AP Statistics concept for real-world studies with groups of different sizes. The spread is affected more by the smaller sample ($n_S = 50$), because its term in the standard deviation formula, $\frac{p_S(1-p_S)}{n_S}$, contributes more variance than the term for the larger sample ($n_R = 500$) whenever $p_R(1-p_R)$ and $p_S(1-p_S)$ are of comparable size (choice 2). Choice 1 is a distractor: it claims more variability because of the large sample, which is backward, since a larger sample reduces that sample's variance contribution. Mini-lesson: for a difference from independent samples, the variances add, so the smaller sample drives most of the uncertainty, while the center remains at $p_R - p_S$. Unequal sizes do not prevent normality if the conditions are met. This highlights the value of enlarging the smaller group when balanced precision matters.
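
The relative contribution of each sample to the total variance can be made concrete. In the sketch below, both completion rates are set to a hypothetical 0.60 so that only the sample sizes differ; the question itself does not give $p_R$ or $p_S$.

```python
import math

# Hypothetical completion rates; equal values isolate the effect of sample size.
p_r, n_r = 0.60, 500
p_s, n_s = 0.60, 50

var_r = p_r * (1 - p_r) / n_r  # variance contributed by the Region R sample
var_s = p_s * (1 - p_s) / n_s  # variance contributed by the Region S sample

total_sd = math.sqrt(var_r + var_s)
share_s = var_s / (var_r + var_s)

print(f"variance term for R (n = {n_r}): {var_r:.5f}")
print(f"variance term for S (n = {n_s}): {var_s:.5f}")
print(f"SD of the difference: {total_sd:.4f}")
print(f"share of total variance from the smaller sample: {share_s:.1%}")
```

With equal proportions, the smaller sample's variance term is ten times the larger sample's, so most of the spread in $\hat{p}_R-\hat{p}_S$ comes from Region S.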

Question 10

A public health researcher wants to compare the proportion of adults who got a flu shot this year in two cities. A simple random sample of $n_1=200$ adults is taken from City A and a simple random sample of $n_2=200$ adults is taken from City B, and the statistic $\hat{p}_A-\hat{p}_B$ is computed. If these samples were repeatedly taken the same way, the sampling distribution of $\hat{p}_A-\hat{p}_B$ would be approximately normal and centered at $p_A-p_B$, with a standard deviation that depends on $p_A$, $p_B$, $n_1$, and $n_2$. Which statement is correct?

  1. The sampling distribution of $\hat{p}_A-\hat{p}_B$ is centered at the observed sample difference $\hat{p}_A-\hat{p}_B$ from this one set of samples.
  2. If $n_1$ and $n_2$ are increased, the sampling distribution of $\hat{p}_A-\hat{p}_B$ becomes wider because larger samples vary more.
  3. The sampling distribution of $\hat{p}_A-\hat{p}_B$ is centered at the true difference in population proportions $p_A-p_B$.
  4. Because two samples are taken, $\hat{p}_A-\hat{p}_B$ has no sampling variability and will be the same in repeated samples.
  5. The standard deviation of $\hat{p}_A-\hat{p}_B$ depends only on $n_1$ and $n_2$, not on $p_A$ or $p_B$.
Explanation: This question assesses understanding of the sampling distribution of the difference in two sample proportions, a key concept in AP Statistics for comparing categorical data from two groups. The sampling distribution of $\hat{p}_A - \hat{p}_B$ is centered at the true population difference $p_A - p_B$ (choice 3), not at the observed sample difference, and its spread is given by the standard deviation formula, which incorporates both population proportions and both sample sizes. A common distractor is choice 1, which suggests the distribution is centered at the observed difference from one set of samples, confusing the sample statistic with the parameter. Mini-lesson: when independent random samples are taken from two populations, the difference in sample proportions is an unbiased estimator of the true difference, meaning repeated samples produce values that vary around $p_A - p_B$, with an approximately normal shape when the large-sample conditions are met. The spread decreases as the sample sizes increase, reflecting more precise estimates. Because the distribution is centered at the parameter, the statistic does not systematically overestimate or underestimate the population difference.