This work is a reflection on my own experiences of collaborative data science work, using the DEBRIEF framework. It is adapted from my report initially submitted in March 2022.
Describe events
Six students, including me, worked together on the Group Project in the Data Science module. Most of us had never met before and we had not previously worked together. I was initially working from home due to various circumstances. The other five were present in person. We were working on the Open University Learning Analytics Dataset (OULAD). The question we examined was: ‘How can data about students’ interactions with the University educational technology help them improve students’ engagement and academic achievement?’. I focused on student withdrawals. We worked with R Markdown. The choice to work with R was made in my absence by other students, as those students felt most confident using R. GitHub was used for version control. My ideas included looking at withdrawal dates and using a logistic regression model to predict which students were likely to withdraw. I suggested the ‘sum of clicks’ and ‘score’ as possible predictive factors for outcome. Others were supportive of these ideas; we each wrote part of the code to implement them. Another student suggested the ‘number of late submissions’ as a predictive factor and this was also included. We did exploratory data analysis (EDA) individually, before bringing our results together to build a predictive model and report our findings (Baraskar et al. 2022).
Evaluation
While working remotely, I repeatedly asked for team calls to be scheduled and for help with file management in R, but unfortunately, these things did not happen. It soon became clear that traveling to Newcastle was necessary. Attending in person made it much easier to discuss things such as division of work, as well as to ask for help with any R issues. Collaborating using the GitHub branch system usually worked well, but this way of working was new to me. I was sometimes unsure of what to do or whether what I had done had worked, partly due to remote working. Two people worked on the same sub-tasks for several days before realizing it, and one person worked on a topic that was out of the scope of the project. However, using GitHub also means our research can later be further developed, by ourselves or others. As Poduska (2020) states: ‘[…] you’re not only collaborating with people right now: You’re collaborating with people in the future’.
Now I have experience of collaborative working on GitHub, I am keen to work in this way again. Guides to working with GitHub are available such as Mines (2018). In data science, collaborative working is increasing (Zhang et al. 2020:1) and it was valuable to get this experience.
Technology such as Teams and WhatsApp enabled us to hold group meetings and communicate from different locations. Technology can be prone to failure and perhaps our group did not always use technology in the best possible ways. Moreover, connecting virtually is not the same as, or as good as, connecting in person.
It was great to work together to find solutions, make new connections with other students, and further develop teamwork and communication skills that are highly valued by employers. As a group we benefited from a flowchart about selecting the right type of predictor (Bevans 2020); a team member had found this online. Other team members provided interesting perspectives on areas including Maths and Software Engineering, which improved business understanding. At first, one person could not see the point of looking at withdrawals, but I pointed out that a large part of improving students’ engagement is preventing withdrawals.
I am very much an ‘ideas person’ and it was great to work together with others who were supportive of these ideas and were better than I at writing R code. Students who were strong in R spent time helping me. This was time they did not spend doing technical work, but without their help, I could not have done my own technical work. Conversely, I helped those whose first language is not English.
Bring out emotions etc.
I felt isolated during the early days of this project. I did not feel involved in the data management process, decision-making, or team discussions. However, I found the subject area interesting. I was also glad to work in a sub-area of the task that intrigued me. I felt happy when our group work improved significantly towards the end of the project. On one occasion, I felt powerless and confused when there was a conflict between two other group members. This conflict took place on our group Teams chat. In this situation, I was unsure of what action to take, but I decided to wait and see how things developed. I did not get involved in the situation. The conflict seems to have been short-lived and I was relieved when it was resolved. This showed me that the ‘fight, flight or freeze’ response (Cannon 1915, quoted in Frothingham 2021) is a very real thing. When the conflict occurred, my response was to freeze. If I face a similar situation again, perhaps there will not be much I can do about my initial reaction of freezing, but maybe I could prepare a list of resources that I or others could use. These could include Brett and Goldberg (2017), Lewitter et al. (2019), and Indeed (2021). Perhaps in the future, I can consider whether to act in the same way again or become actively involved in conflict resolution. My decision will probably depend on the situation.
Review in light of previous experience
I have worked in remote international teams as a freelance translator since 2011 and before that, as a charity worker from 2007 to 2011. Usually, teams work well remotely but sometimes there are signs that my presence in person is needed. As several of these signs were present during the early days of this project, I traveled to Newcastle for some days. It was helpful to attend in person and this seems to have been ‘the right call’.
Perhaps I could have made greater efforts to educate the other group members about ways of working in remote or hybrid teams that make sure everyone is included. I do not know how much experience the other group members have of working in remote or hybrid teams. Some of the reasons why I did not further emphasize my own experience were that I did not want to ‘show off’ and I did not want to be perceived as being ‘old’ (even though I almost certainly was older than most or all other group members). However, I asked several times to be better included in the group (while working remotely) and unfortunately, this was fruitless. It is difficult (and possibly pointless) to speculate why this was the case. As Rosling (2018:206) states: ‘The blame instinct is the instinct to find a simple, clear reason for why something bad has happened’. But according to Dekker (2006:73): ‘The reality is that there is no such thing as the cause, or primary cause or root cause. Cause is something we construct, not find’ (italics are Dekker’s).
Identify lessons learned
If I am present in person when others are working remotely, I hope to go out of my way to include the remote workers, as the remote workers are not able to unilaterally include themselves. If someone else is sick and has hospital appointments while I am healthy and have a more flexible schedule, I hope to adjust my schedule to suit them, because that would be something I could do, but not something that they could. If someone is genuinely struggling to code, and I know how to write that code, I shall aim to set time aside to work with them.
On a different note, the choice to allow group members to select areas that they are interested in worked well and is something I would hope to repeat in future projects.
I realized near the end of this project that I could have contacted one of the other group members, who had not been attending in person and who had not submitted any code by the end of the first week. I could have checked how they were doing and what they were working on. I didn’t need anyone’s permission to do that, and I shouldn’t have let remote working stop me. However, I had become too focused on my own situation and my own desire to be fully included in the group. I simply couldn’t look away from myself to see that others might also be struggling and that they might have appreciated it had I reached out to them. The group dynamics became more apparent after I had started to attend in person. When working remotely, it is difficult to know who else is also not present in person or what the in-person interactions among other group members have been like. So, I must remind myself that there are many things I don’t know, and that ‘I don’t know what I don’t know’. Following Broadwell’s (1969) framework, this could be viewed as moving from the ‘Unconscious Incompetent’ to the ‘Conscious Incompetent’. Broadwell originally used these terms to describe levels of teaching practice, but the terms could also apply to levels of awareness or understanding.
I found that when working in remote or hybrid teams, much time must be allocated to group meetings and communication within the team. Meetings take time away from the technical work but are absolutely necessary. Remote working (as I have also previously experienced) takes away the constant conversations in the workplace, be these process-related or social. Being the only remote worker made it difficult for me to influence any aspect of the project work. Working remotely and individually on EDA meant I did not know how my work fit into the overall picture or whether I was replicating anyone else’s work. It also made it harder for us to develop creative ideas together.
Establish follow-up actions
In the future I will try to clarify early on whether my presence in person will be needed. If there is anything that prevents my presence in person, I will be meticulous about making arrangements for successful remote work. I will also try to find out who to contact and what to do if there are any issues with working remotely. I will continue to praise and encourage those group members who demonstrate genuine teamwork. In future projects, if I face any difficulties, I shall aim to communicate these clearly, until these have been heard and I have been able to work out a way forward together with the other team members.
I aim to invest a lot of time into learning how to code better in R and how to use GitHub, as these things will help me to be a more independent and competent practitioner in this field. While some degree of interdependence among team members is necessary and good, I no longer wish to have to ask for so much help on future projects.
In future projects, it would be beneficial to clarify, early on, what everyone’s strengths are. This would help with the group working process and the division of tasks.
It might be interesting to attempt the same problem using Python, as Python programming skills are valuable. However, there was insufficient time for this in the two weeks of the project.
Feedback on those actions
It seemed to be appreciated when I responded in kind and caring ways when other group members wrote in the WhatsApp chat that they had to be absent (due to a religious festival and an accident). However, the project had aims and a deadline, so maybe I should have tried to encourage people to work at other times to ‘make up for lost time’ after they had had to be absent. Kindness and meeting deadlines might appear to be conflicting aims, but there are ways to successfully manage both these aims together. The schedule and the total number of hours that someone works could be less important than whether they are keeping up with their workload and meeting deadlines. A recent Government report (Government Equalities Office 2019:18) found that:
‘[…] offering flexible working helped retain staff, fostered loyalty and attachment to the organization and improved staff wellbeing, which in turn made them more effective workers.’
In future projects, if I am unable to work for part of the weekday daytimes, then I will aim to work some weekends and evenings, as I did on this project, while tracking the hours that I work. If the person who is unable to work (or unable to be present in person) for some or all of the weekday daytimes is someone else, I shall aim to keep in touch with them. I will do this even if I am working remotely, working different hours, or both. The reasons for keeping in touch would be both to make sure that the other person is okay and to check whether they are keeping up with the project workload.
Bibliography
- Allan, Hayley (undated) A Reflective Tool for Workplace Learning. Accessed at: DEBRIEF-A-Reflective-Tool-for-Workplacelearning.docx (live.com), last accessed 18 May 2022.
- Baraskar, Mayank Prakash, et al. (2022), CSC8633_202122_Group10 repository, on GitHub, NewcastleDataScienceCSC8633_202122_Group10 (github.com), last accessed 22 May 2022.
- Bevans, Rebecca (2020) Choosing the Right Statistical Test | Types and Examples, accessed at Choosing the Right Statistical Test | Types and Examples (scribbr.com), last accessed 26 May 2022.
- Brett, Jeanne M., and Stephen P. Goldberg (2017) ‘How to Handle a Disagreement on Your Team’, in Harvard Business Review, accessed at How to Handle a Disagreement on Your Team (hbr.org), last accessed 20 May 2022.
- Broadwell, Martin M. (1969) ‘Teaching for Learning (XVI)’ in The Gospel Guardian, accessed at: Teaching For Learning (XVI.) – Gospel Guardian vol.20, no.41, pg.1-3a (wordsfitlyspoken.org), last accessed 20 May 2022.
- Cannon, Walter (1915) Bodily changes in pain, hunger, fear, and rage, quoted in Frothingham, Mia Belle (2021) ‘Fight, Flight, Freeze or Fawn: What This Response Means’, in SimplyPsychology, accessed at: Fight, Flight, Freeze, or Fawn: How We Respond to Threats – Simply Psychology, last accessed 20 May 2022.
- Dekker, Sidney (2006) The Field Guide to Understanding Human Error. Aldershot: Ashgate.
- Government Equalities Office (2019) Flexible working qualitative analysis: Organisations’ experience of flexible working arrangements, accessed at Flexible working qualitative analysis (publishing.service.gov.uk), last accessed 22 May 2022.
- Indeed Editorial Team (2021) ‘What Are Conflict Resolution Skills? Definitions and Examples’, accessed at What Are Conflict Resolution Skills? | Indeed.com, last accessed 20 May 2022.
- Lewitter, Fran, et al. (2019) ‘Ten simple rules for avoiding and resolving conflicts with your colleagues’, in PLOS Computational Biology, accessed at Ten Simple Rules for avoiding and resolving conflicts with your colleagues | PLOS Computational Biology, last accessed 20 May 2022.
- Mines, Jonathan (2018) ‘The Ultimate GitHub Collaboration Guide’, accessed at The Ultimate GitHub Collaboration Guide | by Jonathan Mines | Medium, last accessed 22 May 2022.
- Open University Learning Analytics dataset, accessed at Open University Learning Analytics dataset | Scientific Data (nature.com), last accessed 20 May 2022.
- Poduska, Josh (2020) ‘Best Practices for Collaborative Data Science: Five ways to help ensure projects deliver real business value’, in Towards Data Science, accessed at Best Practices for Collaborative Data Science | by Josh Poduska | Towards Data Science, last accessed 20 May 2022.
- Rosling, Hans (2018) Factfulness. Sceptre.
- Zhang, Amy X., et al. (2020) ‘How do Data Science Workers Collaborate? Roles, Workflows and Tools’, in Proc. ACM. Hum.-Comput. Interact., accessed at: How do Data Science Workers Collaborate? Roles, Workflows, and Tools (arxiv.org), last accessed 22 May 2022.