@mediaczar
Last active May 7, 2020 10:09

BACKGROUND

We want to benchmark our LinkedIn Company Page against a series of our competitors. As part of this, we want to collect data on their posts, and the engagement on their posts.

From a list of LinkedIn Company Pages -- e.g. (link removed):

Collect posts and their metadata (date published, social actions on the post, the user IDs of people who have engaged with the post, etc.; a full list is published below).

We expect to collect at least 100 posts per page. Note that LinkedIn pages have an “infinite scroll” feature.
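As an illustration of the "at least 100 posts" target on an infinitely scrolling page, here is a minimal sketch of the stopping logic only. The `load_batch` callback is hypothetical: it stands for whatever browser automation scrolls the page once and returns the posts currently visible. The `min_posts` and `max_idle_rounds` names are assumptions, not part of the brief.

```python
from typing import Callable, Dict, Iterable, List


def collect_posts(
    load_batch: Callable[[], Iterable[dict]],
    min_posts: int = 100,
    max_idle_rounds: int = 3,
) -> List[dict]:
    """Keep scrolling until enough unique posts are seen or the feed runs dry.

    load_batch is a hypothetical callback: each call scrolls once and
    returns the posts currently visible, each as a dict with a "post_url".
    """
    seen: Dict[str, dict] = {}
    idle_rounds = 0
    while len(seen) < min_posts and idle_rounds < max_idle_rounds:
        before = len(seen)
        for post in load_batch():
            # De-duplicate on post URL; the first copy seen wins.
            seen.setdefault(post["post_url"], post)
        # Count consecutive scrolls that produced nothing new.
        idle_rounds = idle_rounds + 1 if len(seen) == before else 0
    return list(seen.values())
```

Stopping after a few idle rounds avoids looping forever on pages that have fewer than 100 posts.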

Please note: If successful, this may be the first of many similar jobs - we may hire more than one freelancer to work on the project simultaneously.

DATA TO COLLECT

Please see the attached JSON “LinkedIn post model” or this schema for example models of the data.

PAGE OBJECT:

  • Page name
  • Page URL
  • Followers
  • Employees

POST OBJECT:

  • Post URL
  • Author (page URL)
  • Date (elapsed time since post. LinkedIn provides "fuzzy" dates -- e.g. "1d", "3d", "1w", "2m")
  • Content (complete message text)
  • Type of post (one of “post”, “shared article”, “shared post”, “linkedin video”, “external video”, “native document”)
  • Number of engagements
  • Number of comments
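Because the dates are "fuzzy", it may help to store an approximate timestamp alongside the raw label. The sketch below is one possible approach, not a required method. The unit lengths are assumptions (a month is taken as 30 days, a year as 365), and a bare "m" is assumed to mean minutes while "mo" means months; adjust if the page uses the labels differently.

```python
import re
from datetime import datetime, timedelta

# Approximate length of each unit. The "m", "mo" and "yr" entries are assumptions.
_UNITS = {
    "m": timedelta(minutes=1),   # assumption: bare "m" means minutes
    "h": timedelta(hours=1),
    "d": timedelta(days=1),
    "w": timedelta(weeks=1),
    "mo": timedelta(days=30),    # assumption: a month is 30 days
    "yr": timedelta(days=365),   # assumption: a year is 365 days
}

_FUZZY = re.compile(r"^(\d+)\s*(mo|yr|m|h|d|w)$")


def parse_fuzzy_date(label: str, collected_at: datetime) -> datetime:
    """Convert a label such as "4d" into an approximate timestamp.

    collected_at is the moment the page was scraped; the result is only
    as precise as the label itself.
    """
    match = _FUZZY.match(label.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised fuzzy date: {label!r}")
    count, unit = int(match.group(1)), match.group(2)
    return collected_at - count * _UNITS[unit]
```

Keeping the original label as well as the estimate means nothing is lost if the assumptions turn out to be wrong.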

ENGAGER OBJECT

  • Post URL where the engagement occurs
  • Name of engager
  • Bio text of engager
  • Profile URL of engager
  • Type of engagement (one of “like”, “celebrate”, “love”, “insightful”, “curious”)

COMMENT OBJECT

  • Post URL where the comment occurs
  • Name of commenter
  • Bio text of commenter
  • Profile URL of commenter
  • Comment (complete message text)
  • Date (elapsed time since comment. LinkedIn provides "fuzzy" dates -- e.g. "1d", "3d", "1w", "2m")
  • Comment URL
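The four objects above could be modelled in a script roughly as follows. This is a sketch only: the class and attribute names are illustrative assumptions and do not have to match the keys in the example JSON.

```python
from dataclasses import dataclass, field
from typing import List

# Engagement types as listed in the brief.
ENGAGEMENT_TYPES = {"like", "celebrate", "love", "insightful", "curious"}


@dataclass
class Engager:
    post_url: str
    name: str
    bio: str
    profile_url: str
    engagement_type: str

    def __post_init__(self) -> None:
        # Reject anything outside the list given in the brief.
        if self.engagement_type not in ENGAGEMENT_TYPES:
            raise ValueError(f"unknown engagement type: {self.engagement_type!r}")


@dataclass
class Comment:
    post_url: str
    name: str
    bio: str
    profile_url: str
    content: str
    date: str          # raw fuzzy label, e.g. "4d"
    comment_url: str


@dataclass
class Post:
    post_url: str
    author_url: str
    date: str          # raw fuzzy label, e.g. "1w"
    content: str
    post_type: str
    engagement_count: int
    comment_count: int
    engagers: List[Engager] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)


@dataclass
class Page:
    name: str
    url: str
    followers: int
    employees: int
    posts: List[Post] = field(default_factory=list)
```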

POST TYPE-SPECIFIC DATA

Depending on the post type (e.g. “shared article”, “linkedin video”) we’ll want to collect a little additional meta data. In each case, we’d collect all the data listed above, plus the specific data listed below. We’ve listed what we believe to be the “special post types” below.

SHARED ARTICLE

Very common: a clickable rich media sharing card with share image, headline and subhead.

[Image: Shared article example]

  • post URL
  • shared article URL (the URL behind the sharing card)
  • title (bold text)
  • subtitle (grey text - usually the root URL of the site where the shared article lives)

LINKEDIN VIDEO

Fairly rare: a video has been uploaded to LinkedIn and appended to the post. It appears in an embedded viewer.

[Image: LinkedIn video example]

  • post URL
  • LinkedIn video URL
  • views (integer)

EXTERNAL VIDEO

Fairly rare: a video on an external platform (e.g. YouTube or Vimeo) has been linked from the post. It has much in common with the Shared Article type above.

[Image: External video example]

  • post URL
  • external video URL
  • title (bold text)
  • subtitle (grey text - usually the root URL of the site where the video lives)

NATIVE DOCUMENT

Relatively rare: a document (e.g. PDF) has been uploaded to LinkedIn and appended to the post. It appears in an embedded viewer.

[Image: Native document example]

  • post URL
  • native document URL

SHARED POST

Rare: the page has shared a post by another LinkedIn user as an embedded attachment to their own post.

[Image: Shared post example]

  • post URL
  • shared post URL
  • shared post author name
  • shared post author bio
  • shared post author profile URL
  • shared post content
  • shared post date (elapsed time since post. LinkedIn provides "fuzzy" dates -- e.g. "1d", "3d", "1w", "2m")
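One way to check that each special post type carries its extra data is a simple lookup table, sketched below. The post type names come from the list in this brief; the field names and the `missing_extra_fields` helper are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Extra fields expected per post type (field names are assumptions).
EXTRA_FIELDS: Dict[str, Tuple[str, ...]] = {
    "post": (),
    "shared article": ("shared_article_url", "title", "subtitle"),
    "linkedin video": ("linkedin_video_url", "views"),
    "external video": ("external_video_url", "title", "subtitle"),
    "native document": ("native_document_url",),
    "shared post": (
        "shared_post_url",
        "author_name",
        "author_bio",
        "author_profile_url",
        "content",
        "date",
    ),
}


def missing_extra_fields(post_type: str, extra: dict) -> List[str]:
    """Return the names of required extra fields that are absent or empty."""
    if post_type not in EXTRA_FIELDS:
        raise ValueError(f"unknown post type: {post_type!r}")
    # A value of 0 (e.g. zero views) counts as present; None and "" do not.
    return [
        name
        for name in EXTRA_FIELDS[post_type]
        if extra.get(name) is None or extra.get(name) == ""
    ]
```

An empty list means the record is complete for its type.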

OUTPUT

Please supply data as SQLite files (or MySQL exports). JSON is an acceptable alternative. We can discuss other alternatives, but note that we think flat-file options are unlikely to suit the nested nature of the data (pages contain posts, which in turn contain engagers and comments).

Please supply the scripts used (preferably Python 3).
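For the SQLite option, a relational layout along the following lines would be acceptable. This is a sketch under assumptions: the table and column names are illustrative, and the type-specific data (shared articles, videos and so on) is left out for brevity.

```python
import sqlite3

# Illustrative schema; table and column names are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pages (
    page_url   TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    followers  INTEGER,
    employees  INTEGER
);
CREATE TABLE IF NOT EXISTS posts (
    post_url          TEXT PRIMARY KEY,
    page_url          TEXT NOT NULL REFERENCES pages(page_url),
    post_date         TEXT,   -- raw fuzzy label, e.g. "4d"
    content           TEXT,
    post_type         TEXT,
    engagement_count  INTEGER,
    comment_count     INTEGER
);
CREATE TABLE IF NOT EXISTS engagers (
    post_url         TEXT NOT NULL REFERENCES posts(post_url),
    name             TEXT,
    bio              TEXT,
    profile_url      TEXT,
    engagement_type  TEXT
);
CREATE TABLE IF NOT EXISTS comments (
    comment_url   TEXT PRIMARY KEY,
    post_url      TEXT NOT NULL REFERENCES posts(post_url),
    name          TEXT,
    bio           TEXT,
    profile_url   TEXT,
    content       TEXT,
    comment_date  TEXT    -- raw fuzzy label
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Keying posts and comments on their URLs makes repeated scrapes of the same page easy to de-duplicate.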

{
"page" : "Herbert Smith Freehills",
"page_url" : "https://www.linkedin.com/company/herbert-smith-freehills/",
"page_followers" : 105744,
"page_employees" : 5406,
"posts" :
[
{
"post_url" : "https://www.linkedin.com/posts/herbert-smith-freehills_herbertsmithfreehills-belfast-law-activity-6636221013833187329-ez-j",
"post_content" : "Our Belfast office recently partnered with The Prince's Trust on the Million Makers Challenge – a dragons’ den style fundraising challenge which aims to help young people across the UK.\n\nEarlier this week, the Belfast team was crowned 2020 Champions for the Northern Ireland and Scotland region, after exceeding their fundraising target of £10,000.\n\nCongratulations to those involved - Christopher Morgan, Eva Price, Lorenzo Campioni, Emma McIvor, Niall Shields, and Gemma Padden. The judging panel were highly complimentary of the professionalism and resilience the team demonstrated throughout the challenge. \n\n \nA special mention goes to Gemma Padden who won the Outstanding Individual contribution award for the region for her leadership skills. Our team go forward to the National Final and Awards Celebration in London in March 2020. We wish them every success!\n\n \n#herbertsmithfreehills #belfast #law #fundraising",
"post_date" : "4d",
"post_type" : "image post",
"post_engagement" :
{
"engager_count" : 47,
"engagers" :
[
{
"engager_name" : "Ryan Collins CMgr MCMI (ISO Specialist)",
"engager_bio" : "Managing Director @ New Paradigm Consulting",
"engager_URL" : "https://www.linkedin.com/in/ryancollinsconsultant/",
"engager_content" : "like"
},
{
"engager_name" : "Kathryn (McKernan) Totten",
"engager_bio" : "Senior Solicitor, Herbert Smith Freehills",
"engager_URL" : "https://www.linkedin.com/in/kathryn-totten-920916103/",
"engager_content" : "like"
},
{
"engager_name" : "Mary Doherty",
"engager_bio" : "Legal Professional",
"engager_URL" : "https://www.linkedin.com/in/mary-doherty-3ab80133/",
"engager_content" : "love"
},
{
"engager_name" : "David Sayce",
"engager_bio" : "Digital Marketing Consultant - Helping your business succeed ► Digital Strategy, SEO, SEM & Tech Audits ►► Available",
"engager_URL" : "https://www.linkedin.com/in/dsayce/",
"engager_content" : "celebrate"
}
]
},
"post_comments" :
{
"comment_count" : 2,
"commenters" :
[
{
"commenter_name" : "Sinéad Lunny LLB ACIM ANEA(Public Speaking) ANEA(Acting)",
"commenter_bio" : "Speaker, Event Host, Managing Director of Public Speaking NI, Voiceover Artist, Narrator, Actor",
"commenter_URL" : "https://www.linkedin.com/in/sin%C3%A9ad-lunny-llb-acim-anea-public-speaking-anea-acting-9422934a/",
"comment_content" : "Huge congratulations, Gemma Padden! What an achievement. Well done to the whole HSF Belfast team!",
"comment_date" : "4d",
"comment_URL" : "https://www.linkedin.com/feed/update/urn:li:activity:6636221013833187329?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A6636221013833187329%2C6636230701412831232%29"
},
{
"commenter_name" : "Gemma Padden",
"commenter_bio" : "LD Adviser at Herbert Smith Freehills",
"commenter_URL" : "https://www.linkedin.com/in/gemma-padden-9847095a/",
"comment_date" : "4d",
"comment_URL" : "https://www.linkedin.com/feed/update/urn:li:activity:6636221013833187329?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A6636221013833187329%2C6636230701412831232%29&replyUrn=urn%3Ali%3Acomment%3A%28activity%3A6636221013833187329%2C6636232862930612224%29"
}
]
}
},
{
"post_url" : "https://www.linkedin.com/posts/herbert-smith-freehills_global-ma-trends-2020-ep4-deal-disruption-activity-6637652670167425024-ahjF",
"post_content" : "\"Preparation is key. It is worth bearing in mind that having a clear strategy for managing the anti-trust process should mean that parties can minimise the transaction timetable and that in turn should reduce the risk of disruption by others prior to completion of the M&A deal. If you can anticipate where the disruption may come from, it will be easier to cater for and so minimise its impact.\" Listen to this podcast to learn how. https://lnkd.in/dsAFspe\n\nAlex Kay, Kyriakos Fountoukakos\n#mergersandacquisitions",
"post_date" : "4d",
"post_type" : "shared article",
"shared_article" :
{
"shared_article_url" : "https://soundcloud.com/herbert-smith-freehills/global-ma-trends-2020-ep-4",
"shared_article_title" : "Global M&A Trends 2020 EP4: Deal disruption – the new normal by Herbert Smith Freehills Podcasts",
"shared_article_subtitle" : "soundcloud.com"
}
},
{
"post_url" : "https://www.linkedin.com/posts/herbert-smith-freehills_delighted-to-launch-our-annual-review-of-activity-6631201865449123841-o_R3/",
"post_content" : "\"Preparation is key. It is worth bearing in mind that having a clear strategy for managing the anti-trust process should mean that parties can minimise the transaction timetable and that in turn should reduce the risk of disruption by others prior to completion of the M&A deal. If you can anticipate where the disruption may come from, it will be easier to cater for and so minimise its impact.\" Listen to this podcast to learn how. https://lnkd.in/dsAFspe\n\nAlex Kay, Kyriakos Fountoukakos\n#mergersandacquisitions",
"post_date" : "2w",
"post_type" : "shared post",
"shared_post" :
{
"shared_post_url" : "https://www.linkedin.com/feed/update/urn:li:activity:6631201426552963072/",
"shared_post_author" : "Paul Lewis",
"shared_post_author_bio" : "Partner at Herbert Smith Freehills",
"shared_post_author_url" : "https://www.linkedin.com/in/paul-lewis-6306722a/",
"shared_post_content" : "Delighted to launch our Annual Review of legal developments in the Insurance sector to a packed audience at our London office this morning. For the full report, click here: \nhttps://lnkd.in/d2BivaQ",
"shared_post_date" : "2w"
}
},
{
"post_url" : "https://www.linkedin.com/posts/herbert-smith-freehills_intellectual-property-and-brexit-activity-6637653686795464704-Pfu5/",
"post_content" : "New Brexit Legal Guide section available: Brexit and intellectual property.\n\nThis provides a useful overview of how Brexit will impact a number of IP rights, including trade marks, copyright, geographical indications, patents and designs.\n\n#Brexit #FutureRelationship #TransitionPeriod #intellectualproperty #iplaw",
"post_date" : "3h",
"post_type" : "native document",
"native_document" :
{
"native_document_url" : "https://www.linkedin.com/feed/update/urn:li:activity:6631201426552963072/"
}
},
{
"post_url" : "https://www.linkedin.com/posts/herbert-smith-freehills_last-week-our-competition-regulation-and-activity-6626042729778425856-YSGQ/",
"post_content" : "Last week our Competition, Regulation and Trade team in Brussels were delighted to host the inaugural Cartels Workshop, in partnership with Concurrences which brought together members of the Brussels antitrust community to hear from panellists on key topics and developments in one of the main pillars of Competition law across Europe.\n\nFurther details of the conference please visit the event page (https://lnkd.in/deCZSg4). In this video Brussels Office Managing Partner Kyriakos Fountoukakos gives his take on the successful event. \n\nKyriakos Fountoukakos, Daniel Vowden, Josh Sherer, Gerald Miersch, Milan Kristof, James Baker, Bo Vesterdorf, Dirk Van Erps, Georgios Gryllos, Angélique de Brousse, Craig Earnshaw.",
"post_date" : "1mo",
"post_type" : "linkedin video",
"shared_linkedin_video" :
{
"linkedin_video_URL" : "https://dms.licdn.com/playlist/C4D05AQFoh62H8ClvDA/feedshare-captions-thumbnails-dualWrite-inhouse-mp4_h264_aac_1600k/0?e=1582639200&v=beta&t=7rVtZWKsHSqLbIMepaTbzkDXyOoVlbGggCVQa2pqGwY",
"linkedin_video_views" : 979
}
},
{
"post_url" : "https://www.linkedin.com/posts/herbert-smith-freehills_power-generation-is-at-the-heart-of-decarbonisation-activity-6633369120211554305-I235/",
"post_content" : "The European #GreenDeal sets very ambitious targets including a reduction of emission to 50-55% (compared to 1990 levels) and a mix of #renewable energy sources contributing to 57% of the total output.\n\nIn Italy the real issue is local permission proceedings. Indeed, there is a strong demand for green power but to achieve the 2030 national targets we must triple the current generation from photovoltaic and double that from wind. To achieve this the Italian Government must streamline the regulatory framework for permission and overcome local resistance. This requires a systematic approach and I believe that Italy is on the right trajectory.\" – Lorenzo Parola was interviewed by leading business journalist Mariangela Pira on February 11 on Sky TG24. \n\n Click here for the full interview (in Italian) - https://lnkd.in/euE_8-f",
"post_date" : "1w",
"post_type" : "external video",
"shared_external_video" :
{
"external_video_link" : "https://www.youtube.com/watch?v=WnnPwmhjIbI",
"external_video_title" : "Power generation is at the heart of decarbonisation and #greenenergy i…",
"external_video_subtitle" : "youtube.com"
}
}
]
}