Problems 4.1-4.6
Data Collection

1. For the network boundary problem (Problem 5) in Chapter 3, provide examples of questions that a researcher might ask to collect network data. In addition, discuss the advantages and disadvantages of open- versus closed-ended question formats in each case.

a. A study of fraternities at a medium-sized Midwestern university

  • For each fraternity, bound the network by the list of active members. In consultation with active members, decide whether to include alumni, pledges, etc.

  • More ambitiously, bound the network by the list of all people who are members of any fraternity on campus. Then study ties within and between fraternities.

  • Once a final list of names is determined, use a closed-ended format (i.e., a roster) to collect the network data

b. The study of an informal activist group in an urban neighborhood

  • Ask representatives of the group for a list of people they consider members. Drop candidates that are mentioned by just one or two informants.

  • As above, once a final list of names is determined, use a closed-ended format (i.e., a roster) to collect the network data
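The filtering rule for the activist group (dropping candidates named by only one or two informants) can be sketched in R. The informant lists, the names, and the cutoff of three mentions are hypothetical illustrations, not data from the text.

```r
# Hypothetical free-list data: each informant names the people they consider members.
nominations <- list(
  informant1 = c("Ana", "Ben", "Cara", "Dev"),
  informant2 = c("Ana", "Ben", "Cara"),
  informant3 = c("Ana", "Cara", "Eli"),
  informant4 = c("Ana", "Ben", "Dev")
)

# Count how many informants mentioned each candidate (each informant counts once).
mentions <- table(unlist(lapply(nominations, unique)))

# Assumed cutoff: keep candidates named by at least three informants,
# i.e. drop those mentioned by only one or two.
min_mentions <- 3
roster <- sort(names(mentions)[as.vector(mentions) >= min_mentions])

print(roster)
```

The resulting `roster` is the closed list of names that would then be presented to every respondent.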

c. The network relations among active hunters in a small village in the Amazon

  • The names of the most active hunters could be elicited from village members and then bounded by degree of activity (e.g., only the most active hunters)

  • See comments in 1a about using a roster.

d. Relationships among non-governmental organizations involving a dam project in West Africa

  • Ask people working on the project, as well as others living in the area, for the names of NGOs working there. Keep only names mentioned by several informants.

  • See comments in 1a about using a roster.

e. Food-sharing networks in a village in Central Asia

  • Try to enlist all households in the village. You will need the villagers themselves to help you determine who qualifies as a member of the village, but there will be disagreements that you will have to work out.

  • See comments in 1a about using a roster.

f. The political network of community activists in a moderate-sized city

  • Identify a seed group of known activists (e.g., check newspapers). Then ask them for the names of other activists.

  • See comments in 1a about using a roster.
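The seed-and-referral procedure for the community activists is a form of snowball sampling, which might be sketched in R as follows. The nomination data, the function name `snowball`, and the `max_waves` stopping rule are assumptions made for illustration.

```r
# Hypothetical nominations: who each activist names when asked for other activists.
names_given <- list(
  Ana  = c("Ben", "Cara"),
  Ben  = c("Ana", "Dev"),
  Cara = c("Eli"),
  Dev  = character(0),
  Eli  = c("Ana")
)

# Snowball sampling: start from the seeds, interview each newly found person,
# and add anyone they name who is not yet in the sample.
snowball <- function(seeds, names_given, max_waves = 3) {
  sampled <- seeds
  frontier <- seeds
  for (wave in seq_len(max_waves)) {
    named <- unique(unlist(names_given[frontier], use.names = FALSE))
    frontier <- setdiff(named, sampled)
    if (length(frontier) == 0) break  # no new names: the sample has stabilised
    sampled <- c(sampled, frontier)
  }
  sampled
}

# Seed group of one known activist (e.g. identified from newspapers).
activists <- snowball(seeds = "Ana", names_given = names_given)
print(activists)
```

In practice the number of waves is a design decision: more waves reach less visible activists but increase fieldwork.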

2. Network data can be extracted from both archival and electronic data sources. Provide an example of social network data that can be collected from each source.

  • The distinction between archival and electronic data sources is outdated, as ‘archival’ data can be stored electronically. However, an example of archival data would be minutes from meetings in which the attendees are listed. These would yield a 2-mode person by meeting matrix. Across many meetings, we can then compute the number of times that each pair of persons attended the same meeting.

  • An example of ‘electronic’ data might be a ‘who follows whom’ network on Twitter.
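The archival example can be made concrete with a small R sketch: attendance lists become a 2-mode person-by-meeting matrix, and multiplying that matrix by its transpose gives the number of meetings each pair of persons attended together. The meetings and names below are invented for illustration.

```r
# Hypothetical attendance lists taken from the minutes of three meetings.
minutes <- list(
  meeting1 = c("Ana", "Ben", "Cara"),
  meeting2 = c("Ana", "Ben"),
  meeting3 = c("Ben", "Cara", "Dev")
)

people <- sort(unique(unlist(minutes)))

# 2-mode person-by-meeting matrix: 1 if the person attended the meeting, else 0.
attendance <- sapply(minutes, function(attendees) as.integer(people %in% attendees))
rownames(attendance) <- people

# 1-mode person-by-person matrix: entry (i, j) counts meetings attended by both.
co_attendance <- attendance %*% t(attendance)
diag(co_attendance) <- 0  # the diagonal held each person's own meeting count

print(attendance)
print(co_attendance)
```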

3. When designing a social network survey instrument, what are some of the ways you can reduce respondent burden?

  • Consider asking only network questions that are directly relevant to your study (i.e., avoid fishing expeditions!).

  • In some cases, with extremely large networks, a fixed choice approach ('Name 3 people who ...') can cut down on the number of decisions and choices a respondent will have to make.

  • Split the task over multiple days

4. Chris, a social network researcher, is interested in the relationship between people’s frequency of recreational interactions and their political attitudes. Provide examples of the kinds of questions Chris might ask, using both an absolute and relative format for eliciting tie frequency. See Table 4.3 for examples.


Absolute format


For each person listed below, please indicate how often you go out and do fun things with them, such as hiking or playing tennis. In each case, use the following response scale:

  1. Almost never

  2. Once a year

  3. Every six months

  4. Every three months

  5. Every month or so

  6. Every couple of weeks

  7. Every week

  8. More than once a week


Relative format


For each person listed below, please indicate how often you go out and do fun things with them, such as hiking or playing tennis. In each case, use the following response scale:


  1. Never

  2. Very infrequently

  3. Somewhat infrequently

  4. Neither frequently nor infrequently

  5. Somewhat frequently

  6. Very frequently

  7. Constantly


5. Apolline studies organizational behavior and is interested in understanding the relationship between frequency of interaction among employees in a corporate headquarters with 355 employees and their attitudes about corporate culture. She wants to use a closed-ended multigrid question format in her study. Discuss the advantages and disadvantages of this approach in this case and provide some suggestions of ways to reduce respondent burden.


Responding to a grid of 355 employees across multiple network questions could prove daunting to any respondent. If the corporate headquarters is divided into functional units (e.g., shipping, IT, legal department), this structure offers a way of reducing respondent burden. The list of employees can be grouped by functional unit, and each network question can be directed first to the unit and then to the employees within it. So, for example, if one of the network questions involves advice seeking, the initial question could be ‘Do you have ties with anyone in “Shipping”?’ If the respondent has no ties with anyone in Shipping, move on to the next unit. If they do, present the list of people in that unit, and so on.
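The two-stage screening by functional unit might look like this in R. The six-person roster, the unit names, and the respondent's answers are hypothetical; the real roster would have 355 rows.

```r
# Hypothetical employee roster with each person's functional unit.
roster <- data.frame(
  employee = c("Ana", "Ben", "Cara", "Dev", "Eli", "Fay"),
  unit     = c("Shipping", "Shipping", "IT", "IT", "Legal", "Legal"),
  stringsAsFactors = FALSE
)

# Stage 1: the respondent answers one yes/no screening question per unit.
has_ties_in_unit <- c(Shipping = TRUE, IT = FALSE, Legal = TRUE)

# Stage 2: show only the employees in units where the respondent reported ties.
units_to_show <- names(has_ties_in_unit)[has_ties_in_unit]
names_to_show <- roster$employee[roster$unit %in% units_to_show]

print(names_to_show)
cat(length(names_to_show), "of", nrow(roster), "names shown\n")
```

The saving grows with the number of units in which a respondent has no ties at all.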

Although a grid is spatially compact, one of this size could add to the respondent's feeling of being overwhelmed. It might be better to ask each network question sequentially. Ideally, the questions can be ordered in such a way that a later question is asked only about actors the person has already mentioned in response to a previous question.
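The effect of ordering the questions so that each one filters the roster for the next can be sketched in R. The three questions (interaction, advice, a further follow-up) and the answers are hypothetical.

```r
# Hypothetical full roster shown for the first question only.
roster <- c("Ana", "Ben", "Cara", "Dev", "Eli")

# Question 1 (assumed wording): "Whom do you interact with?" asked about everyone.
q1_interact <- c("Ben", "Cara", "Dev")

# Question 2 is asked only about people named in question 1.
q2_roster <- intersect(roster, q1_interact)
q2_advice <- c("Ben", "Dev")

# Question 3 is asked only about people named in question 2.
q3_roster <- intersect(q2_roster, q2_advice)

# Number of names the respondent must consider: sequential design versus full grid.
sequential_decisions <- length(roster) + length(q2_roster) + length(q3_roster)
grid_decisions <- 3 * length(roster)

cat("Sequential:", sequential_decisions, "decisions; grid:", grid_decisions, "\n")
```

This ordering only works when a later tie logically implies the earlier one (for example, seeking advice from someone implies interacting with them).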

6. Construct your own social media topic networks by repeating Practice 4.1 but by selecting your own subjects for tweets, videos or Reddit comments.

Select a newsworthy topic and simply substitute it into the R code. It would also be a good idea to change the number of nodes in the final graph. When these answers were constructed, Tonga was in the news because the cable connecting it to the rest of the world had been severed, cutting it off from the internet. So we build a Twitter network from, say, 250 tweets containing “Tonga”. If you do not have Twitter API keys, you need the rtweet and vosonSML libraries; the plot uses igraph, so that also needs to be installed. The code is then as follows. Note that only the search_tweets() line has been changed.


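library(rtweet)    # search_tweets()
library(vosonSML)  # Create(), Graph()
library(igraph)    # plotting and layout_with_kk

# Only this call differs from Practice 4.1: up to 250 tweets containing "Tonga"
twitterdata <- search_tweets(q = "Tonga", n = 250)

# Tag the rtweet result so vosonSML treats it as a Twitter data source
class(twitterdata) <- append(class(twitterdata), c("datasource", "twitter"))

actorNetwork <- twitterdata %>% Create("actor", writeToFile = TRUE, verbose = TRUE)

actorGraph <- actorNetwork %>% Graph(writeToFile = TRUE)

plot(actorGraph, vertex.label = "", vertex.size = 5, vertex.color = "red", edge.arrow.size = 0.1, layout = layout_with_kk)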
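One possible way to change the number of nodes in the final graph is to keep only the highest-degree actors before plotting. This is a sketch under assumptions and not necessarily the method used in Practice 4.1: the random graph `g` stands in for `actorGraph` so the example runs without Twitter access, and the cutoff `n_keep` and the degree-based rule are illustrative choices.

```r
library(igraph)

# Stand-in for actorGraph: a random directed graph with 100 nodes.
set.seed(42)
g <- sample_pa(100, directed = TRUE)

# Assumed reduction rule: keep the n_keep nodes with the highest total degree.
n_keep <- 25
deg <- degree(g, mode = "all")
top_nodes <- order(deg, decreasing = TRUE)[seq_len(n_keep)]
g_small <- induced_subgraph(g, top_nodes)

# Same plotting options as the Tonga example.
plot(g_small, vertex.label = "", vertex.size = 5, vertex.color = "red",
     edge.arrow.size = 0.1, layout = layout_with_kk)
```

With real data, replacing `g` by `actorGraph` would apply the same reduction to the Twitter network.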