We consider the role of infants’ pre-linguistic vocalizations in relation to caregiver input. Using a cross-sectional sample, we investigate whether early production is driven by infants’ perception of segments that match their own phonological capacity. We analyzed 44 infants’ consonant productions from an hour of home-recorded video data taken at 10-11 months. Each consonant the infant produced was transcribed, along with any acoustically salient word produced by the caregiver in the preceding 15 s. We coded whether infants’ consonant productions were phonetically congruent with caregiver input (e.g., the mother says ‘ball’ 5 s before the infant produces /bə/). The proportion of infant productions that matched caregiver input was significantly above the chance level estimated from scrambled parental-production data. Furthermore, infants with more stable consonant productions (‘vocal motor schemes’) responded with congruent consonants significantly more often when the consonant in question was established in their phonological inventory. These findings have implications for understanding early lexical learning in the context of the perception-production interface.
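
To illustrate the comparison against scrambled caregiver data, the following Python sketch shows one way such a chance baseline could be computed. It is not the procedure used in the study. The data layout (one consonant per infant production, paired with the set of consonants heard in caregiver words during the preceding window), the shuffling scheme, and the names `match_proportion` and `scramble_test` are assumptions introduced for this example, and the values are toy data.

```python
import random


def match_proportion(infant, caregiver):
    """Share of infant consonants that appear in the paired caregiver context."""
    if not infant or len(infant) != len(caregiver):
        raise ValueError("need equally long, non-empty sequences")
    hits = sum(1 for cons, ctx in zip(infant, caregiver) if cons in ctx)
    return hits / len(infant)


def scramble_test(infant, caregiver, n_perm=2000, seed=0):
    """Compare the observed match proportion with a scrambled baseline.

    Assumption: "scrambling" is modelled as randomly re-pairing caregiver
    contexts with infant productions. Returns the observed proportion and a
    one-sided permutation p-value.
    """
    rng = random.Random(seed)
    observed = match_proportion(infant, caregiver)
    shuffled = list(caregiver)
    at_least_as_high = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if match_proportion(infant, shuffled) >= observed:
            at_least_as_high += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_high + 1) / (n_perm + 1)
    return observed, p_value


# Toy data (hypothetical): infant consonants and caregiver consonant sets.
infant = ["b", "d", "m", "b", "g", "d"]
caregiver = [{"b", "l"}, {"d"}, {"m", "k"}, {"b"}, {"s"}, {"t", "d"}]

observed, p_value = scramble_test(infant, caregiver)
print(f"observed match proportion = {observed:.2f}, p = {p_value:.4f}")
```

Re-pairing contexts across productions keeps the overall frequency of each consonant in both the infant and caregiver data fixed, so the baseline reflects only how often matches would arise without any temporal link between input and production.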